{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "wC2P9rnjUrw1"
   },
   "source": [
    "Clustering is a very common analysis and visualization technique in business intelligence. In marketing, you'll target groups of people differently, say teens versus pensioners, and some groups are more valuable than others. Often, very simple methods such as principal component analysis (PCA) and k-means are employed for that purpose: in a first step, dimensionality is reduced; then groups are separated by applying a clustering algorithm.\n",
    "\n",
    "Personally, I often find it frustrating to see these methods used, especially when so many better methods are available. After all, PCA was proposed in 1901 (Karl Pearson, \"On Lines and Planes of Closest Fit to Systems of Points in Space\") and k-means in 1967 (James MacQueen, \"Some Methods for Classification and Analysis of Multivariate Observations\"). While both methods had their place when data and computing resources were hard to come by, many alternatives exist today.\n",
    "\n",
    "Both PCA and k-means have serious shortcomings that affect their usefulness in practice. Since PCA operates on the covariance (or correlation) matrix, it can only capture linear relationships between variables. This means that if variables are related, but not linearly (as you might see in a scatter plot), PCA will miss that structure. Further, PCA is based on means and variances, which fully characterize the data only when it follows a Gaussian distribution. K-means, being a centroid-based clustering algorithm, tends to find only roughly spherical groups in Euclidean space, i.e. it fails to uncover more complicated structures, and it has [many other limitations](https://developers.google.com/machine-learning/clustering/algorithm/advantages-disadvantages). Much more could be said about that; however, let's look at more modern ways to solve a typical clustering application.\n",
    "\n",
    "In this recipe, we'll go through a typical application of marketing segmentation, and we'll use modern, robust, nonlinear methods. At the end, we'll link to even more modern methods."
   ]
  },
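  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration of the k-means limitation described above, here is a minimal sketch on synthetic data (scikit-learn's two-moons toy dataset, not the customer data used in this recipe). The names `X_moons`, `y_moons`, `X_moons_pca`, and `moon_labels`, as well as all parameter values, are our own choices for illustration. Since the toy data is already two-dimensional, PCA merely rotates it here; it is included only to mirror the usual PCA-then-k-means pipeline.\n",
    "\n",
    "An adjusted Rand index of 1.0 would mean that the clusters match the true groups perfectly; on this kind of data, k-means typically scores well below that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.cluster import KMeans\n",
    "from sklearn.datasets import make_moons\n",
    "from sklearn.decomposition import PCA\n",
    "from sklearn.metrics import adjusted_rand_score\n",
    "\n",
    "# Synthetic data (an assumption for illustration): two interleaved,\n",
    "# crescent-shaped groups that are clearly not spherical.\n",
    "X_moons, y_moons = make_moons(n_samples=500, noise=0.05, random_state=0)\n",
    "\n",
    "# The classic baseline: PCA first, then k-means with two centroids.\n",
    "X_moons_pca = PCA(n_components=2).fit_transform(X_moons)\n",
    "moon_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_moons_pca)\n",
    "\n",
    "# Compare the found clusters with the true groups\n",
    "# (1.0 = perfect agreement, values near 0 = no better than chance).\n",
    "print(adjusted_rand_score(y_moons, moon_labels))"
   ]
  },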
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "JadXAqo4hDv8"
   },
   "source": [
    "We download our [dataset from UCI](http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/). This dataset contains customer data that can be used to devise different strategies depending on customer type.\n",
    "\n",
    "The dataset contains demographic and payment information about the customers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "70Mqvv1xhCgx"
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 204
    },
    "colab_type": "code",
    "id": "BFoUVT0LCnFb",
    "outputId": "1b56ff44-1e31-4521-a989-085074852f44"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2020-10-11 23:29:37--  http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data\n",
      "Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252\n",
      "Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:80... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 79793 (78K) [application/x-httpd-php]\n",
      "Saving to: ‘german.data.1’\n",
      "\n",
      "german.data.1       100%[===================>]  77.92K   165KB/s    in 0.5s    \n",
      "\n",
      "2020-10-11 23:29:38 (165 KB/s) - ‘german.data.1’ saved [79793/79793]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!wget http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "1ps-viXDbcgL"
   },
   "outputs": [],
   "source": [
    "names = ['existingchecking', 'duration', 'credithistory', 'purpose', 'creditamount', \n",
    "         'savings', 'employmentsince', 'installmentrate', 'statussex', 'otherdebtors', \n",
    "         'residencesince', 'property', 'age', 'otherinstallmentplans', 'housing', \n",
    "         'existingcredits', 'job', 'peopleliable', 'telephone', 'foreignworker', 'classification']\n",
    "\n",
    "customers = pd.read_csv('german.data', names=names, delimiter=' ')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 224
    },
    "colab_type": "code",
    "id": "vu-US39_jFCA",
    "outputId": "85590fb0-63e4-4e0b-934c-9fb05bf3f55c"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>existingchecking</th>\n",
       "      <th>duration</th>\n",
       "      <th>credithistory</th>\n",
       "      <th>purpose</th>\n",
       "      <th>creditamount</th>\n",
       "      <th>savings</th>\n",
       "      <th>employmentsince</th>\n",
       "      <th>installmentrate</th>\n",
       "      <th>statussex</th>\n",
       "      <th>otherdebtors</th>\n",
       "      <th>...</th>\n",
       "      <th>property</th>\n",
       "      <th>age</th>\n",
       "      <th>otherinstallmentplans</th>\n",
       "      <th>housing</th>\n",
       "      <th>existingcredits</th>\n",
       "      <th>job</th>\n",
       "      <th>peopleliable</th>\n",
       "      <th>telephone</th>\n",
       "      <th>foreignworker</th>\n",
       "      <th>classification</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A11</td>\n",
       "      <td>6</td>\n",
       "      <td>A34</td>\n",
       "      <td>A43</td>\n",
       "      <td>1169</td>\n",
       "      <td>A65</td>\n",
       "      <td>A75</td>\n",
       "      <td>4</td>\n",
       "      <td>A93</td>\n",
       "      <td>A101</td>\n",
       "      <td>...</td>\n",
       "      <td>A121</td>\n",
       "      <td>67</td>\n",
       "      <td>A143</td>\n",
       "      <td>A152</td>\n",
       "      <td>2</td>\n",
       "      <td>A173</td>\n",
       "      <td>1</td>\n",
       "      <td>A192</td>\n",
       "      <td>A201</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>A12</td>\n",
       "      <td>48</td>\n",
       "      <td>A32</td>\n",
       "      <td>A43</td>\n",
       "      <td>5951</td>\n",
       "      <td>A61</td>\n",
       "      <td>A73</td>\n",
       "      <td>2</td>\n",
       "      <td>A92</td>\n",
       "      <td>A101</td>\n",
       "      <td>...</td>\n",
       "      <td>A121</td>\n",
       "      <td>22</td>\n",
       "      <td>A143</td>\n",
       "      <td>A152</td>\n",
       "      <td>1</td>\n",
       "      <td>A173</td>\n",
       "      <td>1</td>\n",
       "      <td>A191</td>\n",
       "      <td>A201</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>A14</td>\n",
       "      <td>12</td>\n",
       "      <td>A34</td>\n",
       "      <td>A46</td>\n",
       "      <td>2096</td>\n",
       "      <td>A61</td>\n",
       "      <td>A74</td>\n",
       "      <td>2</td>\n",
       "      <td>A93</td>\n",
       "      <td>A101</td>\n",
       "      <td>...</td>\n",
       "      <td>A121</td>\n",
       "      <td>49</td>\n",
       "      <td>A143</td>\n",
       "      <td>A152</td>\n",
       "      <td>1</td>\n",
       "      <td>A172</td>\n",
       "      <td>2</td>\n",
       "      <td>A191</td>\n",
       "      <td>A201</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>A11</td>\n",
       "      <td>42</td>\n",
       "      <td>A32</td>\n",
       "      <td>A42</td>\n",
       "      <td>7882</td>\n",
       "      <td>A61</td>\n",
       "      <td>A74</td>\n",
       "      <td>2</td>\n",
       "      <td>A93</td>\n",
       "      <td>A103</td>\n",
       "      <td>...</td>\n",
       "      <td>A122</td>\n",
       "      <td>45</td>\n",
       "      <td>A143</td>\n",
       "      <td>A153</td>\n",
       "      <td>1</td>\n",
       "      <td>A173</td>\n",
       "      <td>2</td>\n",
       "      <td>A191</td>\n",
       "      <td>A201</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>A11</td>\n",
       "      <td>24</td>\n",
       "      <td>A33</td>\n",
       "      <td>A40</td>\n",
       "      <td>4870</td>\n",
       "      <td>A61</td>\n",
       "      <td>A73</td>\n",
       "      <td>3</td>\n",
       "      <td>A93</td>\n",
       "      <td>A101</td>\n",
       "      <td>...</td>\n",
       "      <td>A124</td>\n",
       "      <td>53</td>\n",
       "      <td>A143</td>\n",
       "      <td>A153</td>\n",
       "      <td>2</td>\n",
       "      <td>A173</td>\n",
       "      <td>2</td>\n",
       "      <td>A191</td>\n",
       "      <td>A201</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 21 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "  existingchecking  duration credithistory purpose  creditamount savings  \\\n",
       "0              A11         6           A34     A43          1169     A65   \n",
       "1              A12        48           A32     A43          5951     A61   \n",
       "2              A14        12           A34     A46          2096     A61   \n",
       "3              A11        42           A32     A42          7882     A61   \n",
       "4              A11        24           A33     A40          4870     A61   \n",
       "\n",
       "  employmentsince  installmentrate statussex otherdebtors  ...  property age  \\\n",
       "0             A75                4       A93         A101  ...      A121  67   \n",
       "1             A73                2       A92         A101  ...      A121  22   \n",
       "2             A74                2       A93         A101  ...      A121  49   \n",
       "3             A74                2       A93         A103  ...      A122  45   \n",
       "4             A73                3       A93         A101  ...      A124  53   \n",
       "\n",
       "   otherinstallmentplans housing existingcredits   job peopleliable  \\\n",
       "0                   A143    A152               2  A173            1   \n",
       "1                   A143    A152               1  A173            1   \n",
       "2                   A143    A152               1  A172            2   \n",
       "3                   A143    A153               1  A173            2   \n",
       "4                   A143    A153               2  A173            2   \n",
       "\n",
       "   telephone foreignworker classification  \n",
       "0       A192          A201              1  \n",
       "1       A191          A201              2  \n",
       "2       A191          A201              1  \n",
       "3       A191          A201              1  \n",
       "4       A191          A201              2  \n",
       "\n",
       "[5 rows x 21 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "customers.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "kwAuikn9EBCt"
   },
   "outputs": [],
   "source": [
    "# Categorical attributes (coded as A11, A34, ...) are one-hot encoded below.\n",
    "catvars = ['existingchecking', 'credithistory', 'purpose', 'savings', 'employmentsince',\n",
    "           'statussex', 'otherdebtors', 'property', 'otherinstallmentplans', 'housing', 'job',\n",
    "           'telephone', 'foreignworker']\n",
    "# Numeric attributes are kept as they are. Note that 'classification' is the\n",
    "# dataset's credit-risk label (1 = good, 2 = bad).\n",
    "numvars = ['creditamount', 'duration', 'installmentrate', 'residencesince', 'age',\n",
    "           'existingcredits', 'peopleliable', 'classification']\n",
    "\n",
    "# One indicator column per category level, joined with the numeric columns.\n",
    "dummyvars = pd.get_dummies(customers[catvars])\n",
    "transactions = pd.concat([customers[numvars], dummyvars], axis=1)"
   ]
  },
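  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The encoding step above can be illustrated on a tiny example. This is a minimal sketch with a made-up frame: `toy`, `toy_dummies`, and `toy_features` are hypothetical names and the values are invented. It only shows what `pd.get_dummies` followed by `pd.concat` produces. We pass `dtype=int` here so that the indicator columns are 0/1 regardless of the pandas version; recent pandas versions return boolean columns by default.\n",
    "\n",
    "Each category level becomes its own column, named after the variable and the level (for example `housing_A152`), just as in `transactions`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, not taken from the German credit dataset.\n",
    "toy = pd.DataFrame({'housing': ['A151', 'A152', 'A152'],\n",
    "                    'age': [25, 40, 31]})\n",
    "\n",
    "# One indicator column per level of the categorical variable.\n",
    "toy_dummies = pd.get_dummies(toy[['housing']], dtype=int)\n",
    "\n",
    "# Numeric column and indicator columns side by side.\n",
    "toy_features = pd.concat([toy[['age']], toy_dummies], axis=1)\n",
    "print(toy_features)"
   ]
  },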
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 224
    },
    "colab_type": "code",
    "id": "n8Ld_uXzrPJz",
    "outputId": "390e2cf6-b654-4c8c-8f31-58baac7476b5"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>creditamount</th>\n",
       "      <th>duration</th>\n",
       "      <th>installmentrate</th>\n",
       "      <th>residencesince</th>\n",
       "      <th>age</th>\n",
       "      <th>existingcredits</th>\n",
       "      <th>peopleliable</th>\n",
       "      <th>classification</th>\n",
       "      <th>existingchecking_A11</th>\n",
       "      <th>existingchecking_A12</th>\n",
       "      <th>...</th>\n",
       "      <th>housing_A152</th>\n",
       "      <th>housing_A153</th>\n",
       "      <th>job_A171</th>\n",
       "      <th>job_A172</th>\n",
       "      <th>job_A173</th>\n",
       "      <th>job_A174</th>\n",
       "      <th>telephone_A191</th>\n",
       "      <th>telephone_A192</th>\n",
       "      <th>foreignworker_A201</th>\n",
       "      <th>foreignworker_A202</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1169</td>\n",
       "      <td>6</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>67</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5951</td>\n",
       "      <td>48</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>22</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2096</td>\n",
       "      <td>12</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>49</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>7882</td>\n",
       "      <td>42</td>\n",
       "      <td>2</td>\n",
       "      <td>4</td>\n",
       "      <td>45</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4870</td>\n",
       "      <td>24</td>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>53</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 62 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   creditamount  duration  installmentrate  residencesince  age  \\\n",
       "0          1169         6                4               4   67   \n",
       "1          5951        48                2               2   22   \n",
       "2          2096        12                2               3   49   \n",
       "3          7882        42                2               4   45   \n",
       "4          4870        24                3               4   53   \n",
       "\n",
       "   existingcredits  peopleliable  classification  existingchecking_A11  \\\n",
       "0                2             1               1                     1   \n",
       "1                1             1               2                     0   \n",
       "2                1             2               1                     0   \n",
       "3                1             2               1                     1   \n",
       "4                2             2               2                     1   \n",
       "\n",
       "   existingchecking_A12  ...  housing_A152  housing_A153  job_A171  job_A172  \\\n",
       "0                     0  ...             1             0         0         0   \n",
       "1                     1  ...             1             0         0         0   \n",
       "2                     0  ...             1             0         0         1   \n",
       "3                     0  ...             0             1         0         0   \n",
       "4                     0  ...             0             1         0         0   \n",
       "\n",
       "   job_A173  job_A174  telephone_A191  telephone_A192  foreignworker_A201  \\\n",
       "0         1         0               0               1                   1   \n",
       "1         1         0               1               0                   1   \n",
       "2         0         0               1               0                   1   \n",
       "3         1         0               1               0                   1   \n",
       "4         1         0               1               0                   1   \n",
       "\n",
       "   foreignworker_A202  \n",
       "0                   0  \n",
       "1                   0  \n",
       "2                   0  \n",
       "3                   0  \n",
       "4                   0  \n",
       "\n",
       "[5 rows x 62 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "transactions.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 54
    },
    "colab_type": "code",
    "id": "hKV2o7BcjH5e",
    "outputId": "d0fe50a8-14ab-4af4-a8c8-b1676fd4040c"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'\\nitems_dict = {}\\n\\nfor _, customer in customers.iterrows():\\n    purchases = list(customer.values)\\n    for purchase in purchases:\\n      if purchase not in items_dict:\\n        items_dict[purchase] = len(items_dict)\\n\\ntransactions = np.zeros((len(customers), len(items_dict)))\\n\\nfor customer_index, customer in customers.iterrows():\\n    purchases = list(customer.values)\\n    for purchase in purchases:\\n      transactions[customer_index, items_dict[purchase]] = 1        \\n'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'''\n",
    "items_dict = {}\n",
    "\n",
    "for _, customer in customers.iterrows():\n",
    "    purchases = list(customer.values)\n",
    "    for purchase in purchases:\n",
    "      if purchase not in items_dict:\n",
    "        items_dict[purchase] = len(items_dict)\n",
    "\n",
    "transactions = np.zeros((len(customers), len(items_dict)))\n",
    "\n",
    "for customer_index, customer in customers.iterrows():\n",
    "    purchases = list(customer.values)\n",
    "    for purchase in purchases:\n",
    "      transactions[customer_index, items_dict[purchase]] = 1        \n",
    "'''"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 296
    },
    "colab_type": "code",
    "id": "R19vxGAY1j3g",
    "outputId": "e443aec1-9e18-4b6f-f8ac-65b2120089fa"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 0, 'dimensions')"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3dd3xUdbrH8c9DR3rvXUCQJoZiR7FgWctasawFu65ld23XXdera9m9rnvtig11sWFbFrEiiooKCSi9hB5a6C0kpDz3jzN4RxaSE2ByMpnv+/XKKzPnTPkeHfLM+Z1fMXdHRERSV6WoA4iISLRUCEREUpwKgYhIilMhEBFJcSoEIiIprkrUAUqrcePG3r59+6hjiIgklYyMjLXu3mR3+5KuELRv35709PSoY4iIJBUzW7KnfWoaEhFJcSoEIiIpToVARCTFqRCIiKQ4FQIRkRSXsEJgZi+ZWbaZzdjDfjOzx80s08ymmVnfRGUREZE9S+QZwQhgSDH7TwY6x36uBp5JYBYREdmDhI0jcPcJZta+mIecAbzqwTzY35tZfTNr4e4rE5VJRJJHQWERS9bnkJm9lYVrtrF9R0HUkSI3uFszerepv99fN8oBZa2AZXH3s2Lb/qMQmNnVBGcNtG3btkzCiUjZKCxylq3PYc6qLcxdtYW5qzczf/VWFq/bRn7h/6+XYhZhyHKiad0aFa4QhObuw4HhAGlpaVpJRySJFRU5s1ZuZvycbL6at4YZKzaRm18EBH/s2zY8gM5N6zC4WzMObFqbA5vWplOTWtSpUTXi5BVXlIVgOdAm7n7r2DYRqWBydhQwYd4axs3O5st5a1izJQ+AXq3rMbR/Ww5qXoeuzevSpVltDqiWFN9PK5Qo/4uPBm40szeBAcAmXR8QqTi25hXwxZxsPpq+kvFzs8nNL6JujSoc3aUJx3ZtytFdmtCkTvWoYwoJLARm9gYwCGhsZlnAn4GqAO7+LDAWOAXIBHKAyxOVRUTKxo6CIr6Yk817U7L4ct4adhQU0aROdc49tA0n92xO//YNqVJZw5fKm0T2Ghpawn4HbkjU+4tI2XB3ZizfzDsZyxj90wo25OTTuHZ1LuzfllN6tuDQdg2oXElXesszNcaJyF5ZuzWPD6Yu5+30ZcxbvZVqVSpxQvdmnNO3NUd1bqxv/klEhUBEQssvLOLLuWsYlb6ML+ZkU1DkHNK2Pg+c1YPTerak3gHq2ZOMVAhEpEQrN23njUnLeHPSUrK35NG4dnWGHdmBc9Nac2DTOlHHk32kQiAiu+XuTFywjte+W8Jns1dT5M6gLk34S/+2HHtQU6qq6afCUCEQkV/IzS/kg6nLeeGbRWRmb6XBAVW58qgOXNS/HW0bHRB1PEkAFQIRAWBTTj7//GEJL3+7mLVb8+jeoi5/P7c3p/ZqQY2qlaOOJwmkQiCS4lZtymX4hIW8OXkpOTsKObpLE645uiOHd2qEaYKflKBCIJKisjbk8MyXCxiVnkWhO2f0bslVR3ekW4u6UUeTMqZCIJJiFq/dxtNfZvLelOWYwblpbbjumE60aaj2/1SlQiCSIpas28YTX2Ty/tTlVKlkXDywHdcc05EW9WpGHU0iVmIhMLOmwBFAS2A7MANId/eiBGcTkf1g6bocnhw/n3enBAXg0sPac+2gjjStUyPqaFJO7LEQmNmxwJ1AQ2AqkA3UAM4EOpnZO8Df3X1zWQQVkdLJ2pDDk19k8k5GFpUqGZcMbMf1gzrRtK4KgPxScWcEpwBXufvSXXeYWRXgNOAE4N0EZRORvbBqUy5Pjp/PW5OXYRgXDWjLdYMOpHk9FQDZvT0WAne/rZh9BcAHCUkkInsle0suz3y5gJE/LMXdOS+tDTcceyAt6+sagBQvzDWCZsCDQCt3H2Jm3YHD3P3FhKcTkRJtysnn2QkLePnbReQXOmf3bcVvj+usXkASWpheQyOAl4G7
Y/fnAW8BKgQiEcrZUcDL3y7m2a8WsDWvgNN7t+SW47vQoXGtqKNJkglTCBq7+9tmdhcEzUJmVpjgXCKyBzsKinhz8lIeH5fJ2q15HN+tKb8/sasGgsleC1MItplZI8ABzGwgsCmhqUTkPxQVOWOmr+SRT+aydH0OAzo05LlL+nJou4ZRR5MkF6YQ/J5goflOZvYt0AQ4J6GpROQXvp6/hoc/msPMFZs5qHkdRlzej2O6NNFcQLJflFgI3D3DzI4BugIGzHX3/IQnExFmrtjEQ2Pn8E3mWlrVr8k/zu/NGb1bUUlrAMt+FKbX0DTgTeAtd1+Q+EgisnLTdh75ZB7vTc2iXs2q/Om07lw8sC3Vq2g6aNn/wjQN/Qo4H3jbzIoIegy9vbuBZiKyb7bk5vPcVwt5/uuFuMPVR3Xk+mMPpF5NrQUsiROmaWgJ8Dfgb2bWGfgT8FdAX01E9pPCImdU+jIe+XQua7fu4Iw+LfnDiV01FkDKRKjZR82sHcFZwflAIXB7IkOJpJKJC9Zy/5jZzF65mbR2DXjx0n70blM/6liSQsJcI/gBqAqMAs5194UJTyWSApas28aDY2fzyczVtKpfkycvPIRTe7ZQTyApc2HOCH7j7nMTnkQkRewoKGL4hAU8/kUmVSoZt53UlWFHdtC6wBKZ4qahvtjd/wmcaman7rrf3R9NaDKRCih98Xruem8687O3cmrPFtzzq+4007TQErHizgh2TlhSZzf7PAFZRCqsTdvzefijObwxaSmt6tfkxUvTGNytWdSxRIDip6F+Lnbzc3f/Nn6fmR2R0FQiFYS7M3b6Ku7990zWbc3jqqM6cMvxXahVXavESvkR5tP4BNA3xDYRibNi43b+9MEMxs3Jpmererx8WT96tKoXdSyR/1DcNYLDgMOBJmb2u7hdddEYApE9KixyXv1uMY98Mpcihz+e2o3LDm9PlcqVoo4mslvFnRFUA2rHHhN/nWAzmnROZLfmrNrMHe9O56dlGxnUtQn3n9FDg8Kk3CvuGsFXwFdmNiI2ulhE9iCvoJAnv8jkmS8XUK9mVR67oA+n926pMQGSFMJcI8gxs/8BDgZ+7ufm7seV9EQzGwI8RtCU9IK7P7zL/rbAK0D92GPudPex4eOLRC9jyXrueHc6mdlb+fUhrfjjad1pWKta1LFEQgtTCEYSTDR3GnAtcCmwpqQnmVll4CngBCALmGxmo919VtzD/kgwgd0zsbWQxwLtS3UEIhHZllfA3z6ew6vfL6FlvZqMuLwfg7o2jTqWSKmFKQSN3P1FM7s5rrlocojn9Qcyd05JYWZvAmcA8YXACS4+A9QDVoSPLhKdbzPXcse701i+cTuXHtaeP5zUldrqEipJKswnd+ciNCtjI4xXAGHWxmsFLIu7nwUM2OUx9wKfmtlvCQawHR/idUUiszk3n4fGBgPDOjauxdvXHEa/9loqUpJbmELwFzOrR7Bk5RME3+Bv3U/vPxQY4e5/j3VXfc3Merh7UfyDzOxq4GqAtm3b7qe3FimdL+dmc9d701m9OZdrju7IrSd00fxAUiGEWY9gTOzmJuDYUrz2cqBN3P3WsW3xhgFDYu/znZnVABoD2btkGA4MB0hLS9P0FlKmtuTm85cxs3krfRmdm9bmmeuPoI+miZYKpLgBZU9QzJxC7n5TCa89GehsZh0ICsAFwIW7PGYpMBgYYWbdCHollXghWqSsfDN/Lbe/8xOrNudy3aBO3HJ8Zy0XKRVOcWcE6fvywu5eYGY3Ap8QdA19yd1nmtl9QLq7jyZobnrezG4lKDqXubu+8UvktuUV8NBHs/nn90vp2KQW71x3OH3bNog6lkhCWLL93U1LS/P09H2qUSLFmrx4Pb9/+yeWbchh2BEd+MNJXXUtQJKemWW4e9ru9oVZoWw8u2kiCjOgTCSZ5BUU8o/P5vPchAW0blCTt64+jP4d1CNIKr4wvYb+EHe7BnA2UJCYOCLRmL1yM7e+9SNzVm1haP82
3H1qd40LkJQRptdQxi6bvjWzSQnKI1KmCoucF75eyN8/nUfdmlV56bI0jjtIC8ZIagnTNBR/blwJOJRgFLBIUlu2Poffj/qJSYvWc3KP5jxwVk/NESQpKcy5bwbBNQIjaBJaRND/XyQpuTvvTlnOvaNnYsCj5/XmrENaaaZQSVlhmoY6lEUQkbKwftsO/uu96Xw8cxX9OzTk0fN607qB1guQ1BamaagycCrBrKA/P97dH01cLJH9b8K8Nfx+1E9sysnnv045iGFHdqRyJZ0FiIRpGvo3kAtMB4pKeKxIuZObX8jfPp7LS98uonPT2rxyeX+6t6xb8hNFUkSYQtDa3XslPIlIAsxZtZlb3gy6hV52eHvuPPkgDQ4T2UWYQvCRmZ3o7p8mPI3IfuLujJi4mIc+mkPdGlV5+fJ+HKtFY0R2K0wh+B5438wqEaxNYIC7u86tpVxatzWP296Zxhdzshl8UFP+ek4vGteuHnUskXIrTCF4FDgMmK4J4aS8+/mC8PZ87jvjYC4Z2E7dQkVKEKYQLANmqAhIebajoIhHPp3L8AkL6dKsNq8N689BzXXSKhJGmEKwEPjSzD4C8nZuVPdRKS+WrNvGb9+YyrSsTVw8sC1/PLW7LgiLlEKYQrAo9lMt9iNSboyZtoI7351OJYPnLjmUkw5uHnUkkaQTZmTxf5dFEJHSyM0v5L4xs3j9h6X0bVufx4ceohHCInspzMjiNOBuoB2/HFmssQUSiczsrdz4+hTmrNrCdYM68bsTulC1cqWoY4kkrTBNQyOB29DIYikHRv+0gjvfnUbNqpUZcXk/BmlsgMg+C1MI1sTWFxaJTF5BIQ98OJtXv1tCWrsGPHlhX5rXqxF1LJEKIUwh+LOZvQCM45e9ht5LWCqROFkbcrjh9an8tGwjVx7ZgTtOPkhNQSL7UZhCcDlwEFCV/28ackCFQBLuy7nZ3PLWjxQWOs9e3JchPVpEHUmkwglTCPq5e9eEJxGJU1TkPDZuPo9/MZ+uzerwzMWH0qFxrahjiVRIYQrBRDPr7u6zEp5GBNiwbQe3vPUjX81bw6/7tuKBM3tSs5oGiIkkSphCMBD40cwWEVwj2DnpnLqPyn7307KNXD9yCmu25PHgWT0Z2r+N5goSSbAwhWBIwlNIynN33pi0jHtHz6RJneqMuvYwerepH3UskZQQphBosjlJqNz8Qv70wQxGZWRxdJcmPHZ+HxrU0mwmImUlTCH4kKAYGFAD6ADMBQ5OYC5JEcvW53DdyAxmLN/MTccdyM3Hd9E6wiJlLMxcQz3j75tZX+D6hCWSlPHVvDXc/OZUCoucF36TxvHdm0UdSSQlhTkj+AV3n2JmAxIRRlKDu/P0lwt45NO5dG1Wh2cvPpT26hoqEpkwk879Lu5uJaAvsCJhiaRC25pXwB/e/omPZ67i9N4tefjsnhxQrdTfR0RkPwrzL7BO3O0CgmsG7yYmjlRki9Zu4+pX01mwZit/PLUbw47soK6hIuWA1iOQMjF+bjY3vTGVypWMV68YwJGdG0cdSURiSpy5y8w+M7P6cfcbmNkniY0lFUVwPSCTK0ZMpnWDA/j3jUeqCIiUM2Gahpq4+8add9x9g5lpEngpUW5+Ibe/M43RP63gtF4t+Ns5vXQ9QKQcCjOXb6GZtd15x8zaEXKQmZkNMbO5ZpZpZnfu4THnmdksM5tpZq+Hiy3l3cpN2zn32e/497QV3HZSV54YeoiKgEg5FeZf5t3AN2b2FcGgsqOAq0t6kplVBp4CTgCygMlmNjp+8joz6wzcBRyhM42KY8rSDVzzWgY5eQU8f4nGB4iUd2EuFn8cG0Q2MLbpFndfG+K1+wOZ7r4QwMzeBM4A4mcxvQp4yt03xN4ruzThpfx5b0oWd747neb1ajDyygF0aVan5CeJSKT2WAjMrL27LwaI/eEfs8t+A1q5e9YeXqIVsCzufhaw60C0LrHX+haoDNzr7h/vJsvVxM5C2rZtu+tuKQfcnX98
No/Hv8jksI6NePqivpovSCRJFHdG8D9mVgn4F5ABrCGYa+hA4FhgMPBngj/w+/L+nYFBQGtggpn1jL84DeDuw4HhAGlpaZoEr5yJvyh8flob/nJWDy0lKZJE9lgI3P1cM+sOXARcAbQAcoDZwFjgAXfPLea1lwNt4u63jm2LlwX84O75wCIzm0dQGCaX9kAkGuu25nH1axlkLNnAHUMO4tpjOmqQmEiSKfYaQezC7t17+dqTgc5m1oGgAFwAXLjLYz4AhgIvm1ljgqaihXv5flLGMrO3csWIyazenMvTF/XllJ5aT1gkGSWsP5+7F5jZjcAnBO3/L7n7TDO7D0h399GxfSea2SygELjN3dclKpPsPxMXrOXa1zKoVqUSb149kEPaNog6kojsJXNPrib3tLQ0T09PjzpGSnsnI4u73ptG+0a1eOmyfrRpeEDUkUSkBGaW4e5pu9unET4Smrvzj8/n8/i4+RxxYCOevuhQ6tWsGnUsEdlHYaahNoILxh3d/b7YKOPm7j4p4emk3MgrCHoG/evHFZx7aGseOKsn1aqoZ5BIRRDmjOBpoAg4DrgP2EIwDXW/BOaScmRjzg6ufi2DSYvWc9tJXbl+UCf1DBKpQMIUggHu3tfMpsLPk85ppFCKWLY+h0tfnkTW+u08dkEfzujTKupIIrKfhSkE+bF5gxzAzJoQnCFIBffTso0Me2UyOwqKeG1YfwZ0bBR1JBFJgDCNvI8D7wNNzewB4BvgwYSmksh9Nms1Fwz/nhpVK/Pe9YerCIhUYGEmnRtpZhkEU0oYcKa7z054MonMP79fwj3/msHBLevx4mVpNK1TI+pIIpJAYXoNDQRmuvtTsft1zWyAu/+Q8HRSptydp8Zn8sin8zi2axOeuqiv1hAQSQFhmoaeAbbG3d8a2yYViLvzwIezeeTTeZzZpyXDf5OmIiCSIsL8SzePG37s7kVmpr8QFUhBYRF3vTedURlZXHpYO/78q4OpVEndQ0VSRZgzgoVmdpOZVY393IwmhqswcvMLueH1KYzKyOKmwZ2593QVAZFUE6YQXAscTjCD6M7FZUpcqlLKv215BQx7ZTKfzFzNPad153cndNFAMZEUFKbXUDbBFNJSgWzans/lL0/ix2UbeeTc3pxzaOuoI4lIRML0GmpCsLZw+/jHu/sViYslibR2ax6XvDiJzOwtPH1RX4b00DoCIqkszEXffwFfA58TrBkgSWzFxu1c/OIPrNi4nRcu7ccxXZpEHUlEIhamEBzg7nckPIkk3JJ127jw+R/YvD2f14YNoF/7hlFHEpFyIMzF4jFmdkrCk0hCLVm3jfOf+56cHQW8ftVAFQER+VmYQnAzQTHYbmabzWyLmW1OdDDZf5atz2Ho8O/JLSjk9asG0rN1vagjiUg5EqbXUJ2yCCKJkbUhh6HPf8+2HYWMvHIA3VrUjTqSiJQzoUYIm1kDoDPw8+xj7j4hUaFk/1ixcTsXPv8Dm7bn8/qVA+nRSmcCIvKfwnQfvZKgeag18CMwEPiOYMUyKadWbcrlwue/Z8O2Hbx25QA1B4nIHoW9RtAPWOLuxwKHABsTmkr2yZoteVz4wves3bqDV4b1p0+b+lFHEpFyLEwhyHX3XAAzq+7uc4CuiY0le2tjzg4uefEHVm7M5eXL+9G3bYOoI4lIORfmGkGWmdUHPgA+M7MNwJLExpK9sTk3n9+8NImFa7fx0qX91EVUREIJ02vorNjNe81sPFAP+DihqaTUcnYUcMXLk5m1YjPPXXIoR3ZuHHUkEUkSeywEZlbX3TebWfzXyumx37WB9QlNJqHl5hdy1avpTFm6gSeG9mVwt2ZRRxKRJFLcGcHrwGlABuAE6xXH/+6Y8HRSovzCIm58fQrfZq7j7+f25tRemkBOREpnj4XA3U+zYHL6Y9x9aRlmkpCKipzb35nG57Ozuf/MHpytqaRFZC8U22sotkTlh2WURUrB3blvzCzen7qc207qyiUD20UdSUSSVJju
o1PMrF/Ck0ip/O/n8xkxcTFXHdWB6wd1ijqOiCSxMN1HBwAXmdkSYBuxawTu3iuhyWSPXvpmEY+Nm895aa35r1O6aXlJEdknYQrBSQlPIaG9m5HFfWNmMeTg5jx4Vk8VARHZZ2HGESwBMLOmxE06J2Vv3OzV3P7uNI44sBGPDe1DlcphWvZERIpX4l8SMzvdzOYDi4CvgMXAR2Fe3MyGmNlcM8s0szuLedzZZuZmlhYyd8qZvHg914+cQo+WdXnukjSqV6kcdSQRqSDCfKW8n2DG0Xnu3gEYDHxf0pPMrDLwFHAy0B0Yambdd/O4OgQT2/1QitwpZfbKzVwxYjKtGtTkpcv6Ubt6qNnDRURCCVMI8t19HVDJzCq5+3ggzDf3/kCmuy909x3Am8AZu3nc/cBfgdywoVPJsvU5/OalSdSqVoXXhg2gUe3qUUcSkQomTCHYaGa1gQnASDN7jKD3UElaAcvi7mfFtv3MzPoCbdy92LEKZna1maWbWfqaNWtCvHXFsGZLHpe8+AP5hUW8Nqw/rerXjDqSiFRAYQrBGUAOcCvBZHMLgF/t6xubWSXgUeD3JT3W3Ye7e5q7pzVp0mRf3zop5BUUcvVr6azenMdLl/WjczOtGCoiiRGmsfka4C13Xw68UorXXg60ibvfOrZtpzpAD+DLWBfI5sBoMzvd3dNL8T4Vjrvzx/dnMHXpRp69uK/WFBCRhApzRlAH+NTMvjazG80s7NSWk4HOZtbBzKoBFwCjd+50903u3tjd27t7e4IL0ClfBABembiYURlZ3DS4M0N6aBI5EUmsEguBu/+3ux8M3AC0AL4ys89DPK8AuBH4BJgNvO3uM83sPjM7fR9zV1gTF6zl/g9nc0L3ZtwyuHPUcUQkBZSmH2I2sApYBzQN8wR3HwuM3WXbPXt47KBSZKmQlq3P4YaRU+jQuBaPntebSpU0alhEEi/MgLLrzexLYBzQCLhK8wztfzk7Crjq1XQKi5znf5NGnRpVo44kIikizBlBG+AWd/8x0WFSlXuwrsC81Vt4+fL+dGhcK+pIIpJCwsw1dFdZBEllL36ziDHTVnL7kK4c0yU1useKSPmhWcsi9v3CdTz00RxOOrgZ1x2jdQVEpOypEERo1aZcbnx9Cu0aHcAj5/bWlNIiEgkVgojsKCji+pEZ5Owo5LmLD9XFYRGJzB6vEZjZFsD3tN/d6yYkUYr4y4ezmLJ0I09d2FfTR4hIpPZYCNy9DoCZ3Q+sBF4jWKbyIoKBZbKXPpi6nFe/W8JVR3Xg1F76Tyki0QrTNHS6uz/t7lvcfbO7P8Pup5OWEBau2cp/vT+d/u0bcseQg6KOIyISqhBsM7OLzKyymVUys4sINw217CI3v5AbX59K9SqVtNSkiJQbYf4SXQicB6yO/Zwb2yal9PBHc5i1cjOPnNubFvW0toCIlA9hBpQtRk1B++yTmasYMXExw47swOBuYSdwFRFJvDBzDXUxs3FmNiN2v5eZ/THx0SqO5Ru3c/s70+jVup6uC4hIuROmaeh54C4gH8DdpxGsLSAhFBQWcdMbUykscp4YegjVqui6gIiUL2H+Kh3g7pN22VaQiDAV0WPj5pOxZAMP/ron7RppMjkRKX/CFIK1ZtaJ2OAyMzuHYFyBlOD7het4cnwm56W15vTeLaOOIyKyW2Gmob4BGA4cZGbLgUXAxQlNVQFsysnn1rd+pH2jWvz5VwdHHUdEZI/C9BpaCBxvZrWASu6+JfGxkpu7c9f701izJY/3rj+cWtVLsxCciEjZKvEvlJlVB84G2gNVds6Q6e73JTRZEhuVnsXY6au4Y8hB9GpdP+o4IiLFCvNV9V/AJiADyEtsnOS3aO027v33TA7r2Ihrju4YdRwRkRKFKQSt3X1IwpNUADsKirj5zalUq1KJR8/X4vMikhzC9BqaaGY9E56kAvjH5/OYlrWJh3/dS1NIiEjSCHNGcCRwmZktImgaMsDdvVdCkyWZyYvX8+xXCzg/
rQ1DejSPOo6ISGhhCsHJCU+R5LbkBl1F2zQ4gD/9qnvUcURESqW4FcrquvtmQN1FS3D/mFms2Lidt685jNrqKioiSaa4v1qvA6cR9BZygiahnRxQlxiCWUXfTs/ihmM7kda+YdRxRERKrbilKk+L/e5QdnGSy5otedz13nQOblmXmwd3iTqOiMheCdWOYWYNgM5AjZ3b3H1CokIlA3fnznensTWvgP89v49mFRWRpBVmZPGVwM1Aa+BHYCDwHXBcYqOVb29OXsa4Odncc1p3OjerE3UcEZG9FuZr7M1AP2CJux8LHAJsTGiqci5rQw5/GTOLwzs14rLD20cdR0Rkn4QpBLnungvBvEPuPgfomthY5VfQJDQdgL+e3Uujh0Uk6YW5RpBlZvWBD4DPzGwDsCSxscqv1yct5ZvMtfzlzB60aXhA1HFERPZZmGmoz4rdvNfMxgP1gI8TmqqcWrY+hwc/nM0RBzbiogFto44jIrJfFDegbHed4qfHftcG1ickUTlVVOTc8e40IGgS2jkdt4hIsivujGB3A8l2CjWgzMyGAI8BlYEX3P3hXfb/DriSYA3kNcAV7l4um51GTlrKxAXrePCsnrRuoCYhEak4ihtQtk8DycysMvAUcAKQBUw2s9HuPivuYVOBNHfPMbPrgL8B5+/L+ybCsvU5PDR2Nkd1bszQ/m2ijiMisl+FHVD2a4JZSB342t0/CPG0/kBmbKlLzOxN4Azg50Lg7uPjHv895XQt5Hv+NQMDHlaTkIhUQCV2HzWzp4FrCa4PzACuNbOnQrx2K2BZ3P2s2LY9GQZ8tIcMV5tZupmlr1mzJsRb7z/jZq9m/Nw13Hx8Z1rV1xoDIlLxhDkjOA7o5u4OYGavADP3ZwgzuxhIA47Z3X53Hw4MB0hLS/P9+d7Fyc0v5L4xs+jUpBaXHa4pl0SkYgozoCwTiO8r2Sa2rSTLY4/dqXVs2y+Y2fHA3cDp7l6u1kR+4euFLFmXw72nH6y5hESkwgpzRlAHmG1mkwiuEfQH0s1sNIC7n76H500GOptZB4ICcAFwYfwDzOwQ4DlgiLtn790hJMaKjdt5avwChhzcnKM6N4k6johIwoQpBPfszQu7e4GZ3Qh8QtB99CV3n2lm9wHp7psIP7sAAAmRSURBVD4a+B+CMQmjYhdhlxZTWMrUA2NnU+TO3ad2izqKiEhChSkEa3bp8omZDXL3L0t6oruPBcbusu2euNvHh8xZpiZmruXDaSu59fgumkZCRCq8MA3fb5vZ7RaoaWZPAA8lOlhU8guLuPffM2ndoCbXHKNF2ESk4gtTCAYQXCyeSNDuvwI4IpGhovT6D0uZt3or95zWnRpVK0cdR0Qk4cIUgnxgO1CTYIWyRe5elNBUEcnZUcATX2QysGNDTujeLOo4IiJlIkwhmExQCPoBRwFDzWxUQlNF5JWJS1i7NY8/nNhVI4hFJGWEuVg8zN3TY7dXAmeY2SUJzBSJzbn5PPvVAgZ1bUJa+91NvCoiUjGFOSPIMLOLzeweADNrC8xNbKyy9+LXi9i0PZ8/nJiyi6+JSIoKUwieBg4DhsbubyGYVbTC2LBtBy9+s4iTezSnR6t6UccRESlTYZqGBrh7XzObCuDuG8ysWoJzlalnJyxg244Cbj2hS9RRRETKXKheQ7G1BXZOOtcEqDC9hrI35/LKxMWc2acVXZrViTqOiEiZC1MIHgfeB5qa2QPAN8CDCU1Vhp4an0l+oXPz4M5RRxERiUSYxetHmlkGMJhg2coz3X12wpOVgawNObw+aSnnpbWmfeNaUccREYlEqBXK3H0OMCfBWcrcM18uwDB+e5zOBkQkdaXsJPvZm3MZlZ7F2Ye2pqVWHhORFJayheDFbxZRUFTEtZpYTkRSXEoWgo05O/jn90s4rVdL2jXStQERSW0pWQhembiEbTsKuW5Qp6ijiIhELuUKwba8Al6euIjjuzWlW4u6UccREYlcyhWCNyYtZWNOPtcfe2DUUURE
yoWUKgR5BYU8//VCBnZsSN+2DaKOIyJSLqRUIXh/ynJWb87jBp0NiIj8LGUKQUFhEc98tYBeretx5IGNo44jIlJupEwhGDtjFUvW5XD9oE5afUxEJE7KFIJa1SpzYvdmnNi9edRRRETKlVBzDVUEg7s1Y3A3LUgvIrKrlDkjEBGR3VMhEBFJcSoEIiIpToVARCTFqRCIiKQ4FQIRkRSnQiAikuJUCEREUpy5e9QZSsXM1gBL9vLpjYG1+zFOVHQc5YuOo3zRcexeO3dvsrsdSVcI9oWZpbt7WtQ59pWOo3zRcZQvOo7SU9OQiEiKUyEQEUlxqVYIhkcdYD/RcZQvOo7yRcdRSil1jUBERP5Tqp0RiIjILlQIRERSXMoUAjMbYmZzzSzTzO6MOk9YZvaSmWWb2Yy4bQ3N7DMzmx/73SDKjGGYWRszG29ms8xsppndHNueNMdiZjXMbJKZ/RQ7hv+Obe9gZj/EPltvmVm1qLOGYWaVzWyqmY2J3U+64zCzxWY23cx+NLP02Lak+UztZGb1zewdM5tjZrPN7LCyPI6UKARmVhl4CjgZ6A4MNbPu0aYKbQQwZJdtdwLj3L0zMC52v7wrAH7v7t2BgcANsf8HyXQsecBx7t4b6AMMMbOBwF+Bf7j7gcAGYFiEGUvjZmB23P1kPY5j3b1PXJ/7ZPpM7fQY8LG7HwT0Jvj/UnbH4e4V/gc4DPgk7v5dwF1R5ypF/vbAjLj7c4EWsdstgLlRZ9yLY/oXcEKyHgtwADAFGEAw+rNKbPsvPmvl9QdoHfvjchwwBrAkPY7FQONdtiXVZwqoBywi1nkniuNIiTMCoBWwLO5+Vmxbsmrm7itjt1cBSbUYs5m1Bw4BfiDJjiXWnPIjkA18BiwANrp7QewhyfLZ+l/gdqAodr8RyXkcDnxqZhlmdnVsW1J9poAOwBrg5VhT3QtmVosyPI5UKQQVlgdfF5KmD7CZ1QbeBW5x983x+5LhWNy90N37EHyj7g8cFHGkUjOz04Bsd8+IOst+cKS79yVo9r3BzI6O35kMnymgCtAXeMbdDwG2sUszUKKPI1UKwXKgTdz91rFtyWq1mbUAiP3OjjhPKGZWlaAIjHT392Kbk/JY3H0jMJ6gCaW+mVWJ7UqGz9YRwOlmthh4k6B56DGS7zhw9+Wx39nA+wTFOdk+U1lAlrv/ELv/DkFhKLPjSJVCMBnoHOsVUQ24ABgdcaZ9MRq4NHb7UoL29nLNzAx4EZjt7o/G7UqaYzGzJmZWP3a7JsE1jtkEBeGc2MPK9TEAuPtd7t7a3dsT/Fv4wt0vIsmOw8xqmVmdnbeBE4EZJNFnCsDdVwHLzKxrbNNgYBZleRxRXygpwwsypwDzCNp07446TylyvwGsBPIJvjkMI2jPHQfMBz4HGkadM8RxHElwajsN+DH2c0oyHQvQC5gaO4YZwD2x7R2BSUAmMAqoHnXWUhzTIGBMMh5HLO9PsZ+ZO/9dJ9NnKu5Y+gDpsc/WB0CDsjwOTTEhIpLiUqVpSERE9kCFQEQkxakQiIikOBUCEZEUp0IgIpLiqpT8EJGKwczuBbYCdYEJ7v55Gb//6UB3d3+4LN9XpCTqPiopY2chcPdHos4iUp6oaUgqNDO728zmmdk3QNfYthFmdk7s9mIze2jnfPZm1tfMPjGzBWZ2bdzr3GZmk81sWtw6BO1jc8c/H1uf4NPYiGPM7KbY2gvTzOzN2LbLzOzJuOd+Eds/zszaxmV73MwmmtnCuJwtzGxCLOcMMzuqDP8zSgWnQiAVlpkdSjCFQh+CUcz99vDQpR5MJPc1wfoP5xCsmbDzD/6JQGeCeWz6AIfGTW7WGXjK3Q8GNgJnx7bfCRzi7r2AnwtKnCeAV2L7RwKPx+1rQTAS+zRgZzPShQTTQvchmK/+x3D/FURKpmsEUpEdBbzv7jkAZran+aV2bp8O1Hb3LcAWM8uLzS10YuxnauxxtQkKwFJgkbvv/KOcQbB2BARTBYw0sw8I
pgzY1WHAr2O3XwP+FrfvA3cvAmaZ2c6phycDL8Um7vsg7j1F9pnOCESClccgmJs/L257EcGXJQMe8mAVrD7ufqC7v7jLcwEK+f8vV6cSrIrXF5gcN6tnafIQe2/cfQJwNMGMoCPM7DeleD2RYqkQSEU2ATjTzGrGZqn81V6+zifAFbG1FDCzVmbWdE8PNrNKQBt3Hw/cQbACVe1dHjaRoNkK4CKCZqk9MrN2wGp3fx54gaDAiOwXahqSCsvdp5jZWwSzU2YTNK/szet8ambdgO+C2bTZClxMcAawO5WBf5pZPYJv9I+7+8bYc3f6LcGKVLcRrE51eQkxBgG3mVl+7P11RiD7jbqPioikODUNiYikOBUCEZEUp0IgIpLiVAhERFKcCoGISIpTIRARSXEqBCIiKe7/AJlPVmkTLfwOAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.decomposition import PCA, FastICA\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "pca = PCA().fit(\n",
    "    StandardScaler().fit_transform(transactions)\n",
    ")\n",
    "ax = plt.plot(\n",
    "    range(len(pca.explained_variance_ratio_)), \n",
    "    np.cumsum(pca.explained_variance_ratio_)\n",
    ")\n",
    "plt.ylabel('explained variance (cumulative)')\n",
    "plt.xlabel('dimensions')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "5w2RICX_FLYP",
    "outputId": "b9cf96e1-393d-4215-cc87-e5fd56d7a01e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "62"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(transactions.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "r_cMYGVfvS6n"
   },
   "outputs": [],
   "source": [
    "# Let's find a good number of dimensions for the reduction.\n",
    "from sklearn.manifold import LocallyLinearEmbedding\n",
    "\n",
    "dimensions_range = [10, 20]\n",
    "# reconstruction_errors[n] holds the error for n components;\n",
    "# entries below dimensions_range[0] are never filled and stay at zero.\n",
    "reconstruction_errors = np.zeros((dimensions_range[1]))\n",
    "for n_components in range(*dimensions_range):\n",
    "  lle = LocallyLinearEmbedding(\n",
    "      n_components=n_components, n_jobs=-1, #eigen_solver='auto', n_neighbors=2\n",
    "  ).fit(transactions)\n",
    "  reconstruction_errors[n_components] = lle.reconstruction_error_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 296
    },
    "colab_type": "code",
    "id": "xuhh0Ko6y3zJ",
    "outputId": "96adc8fb-07a7-4de0-c97a-017322fd9de5"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 0, 'dimensions')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAERCAYAAAB2CKBkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deZxcVZ338c83nX1PSAiQHQhLUEJCs4iyjAvbKFEGZRWQJeKIy4w6wsvnQR90xlFnXBBc2ATcADdsIYgQUFBACZ0FSIA0gXT2tTsrWbr79/xRt7FsutM3S9Wt6vq+X6969V1O3fvLTXX9+p5zzzmKCMzMrHJ1yzoAMzPLlhOBmVmFcyIwM6twTgRmZhXOicDMrMI5EZiZVbiyTASSbpe0StLze+l4YyT9QdJ8SfMkjdsbxzUzKwdlmQiAO4DT9+Lx7gK+ERGHA8cCq/bisc3MSlpZJoKIeBxYl79N0kGSfi/pWUlPSDoszbEkTQS6R8TDybE3RcSWvR+1mVlpKstE0IGbgU9ExNHAZ4HvpXzfIUCjpF9LmiXpG5KqChalmVmJ6Z51AHuDpP7ACcAvJLVu7pXsOxu4vp23LY2I08hdgxOByUA9cA9wKXBbYaM2MysNXSIRkLuzaYyIo9ruiIhfA7/eyXuXALMjYiGApPuA43EiMLMK0SWqhiJiA/CqpA8CKGdSyrc/AwyWNDxZfycwrwBhmpmVpLJMBJJ+DjwFHCppiaTLgQuByyXNAV4ApqY5VkQ0k2tTmCHpOUDALYWJ3Mys9MjDUJuZVbayvCMwM7O9p+wai4cNGxbjxo3LOgwzs7Ly7LPPromI4e3tK7tEMG7cOGbOnJl1GGZmZUXSoo72uWrIzKzCORGYmVU4JwIzswrnRGBmVuGcCMzMKpwTgZlZhXMiMDOrcE4EZmZl4NuPvMxf6tYU5NhOBGZmJW7D1h18Z8YCZr7WUJDjOxGYmZW42fWNRMCUsYMLcnwnAjOzEldb34AER412IjAzq0i19Y0csu8ABvTuUZDjOxGYmZWwlpZgVn1DwaqFwInAzKykvbJ6Exu3NjF5zJCCncOJwMyshNXW554UmlKOiUDS7ZJWSXq+k3LHSGqSdE6hYjEzK1e1ixoZ1KcHBw7rV7BzFPKO4A7g9J0VkFQFfA34QwHjMDMrW7X1DUweM5hu3VSwcxQsEUTE48C6Top9AvgVsKpQcZiZlav1r+9gwapNBa0WggzbCCSNBD4AfD9F2WmSZkqauXr16sIHZ2ZWAmYvbgQK2z4A2TYWfxv4fES0dFYwIm6OiOqIqB4+vN25l83MupzaRbmOZJNGDyroebKcvL4auFsSwDDgTElNEXFfhjGZmZWM2voGDh1RuI5krTJLBBExvnVZ0h3A/U4CZmY5LS3B7MWNvPfIAwp+roIlAkk/B04BhklaAnwR6AEQET8o1HnNzLqCuqQj2ZQxhetR3KpgiSAizt+FspcWKg4zs3JUuyjpSDa2sA3F4J7FZmYlqba+gcF9C9uRrJUTgZlZCaqtb2Ty6MEkD9QUlBOBmVmJWb9lB3VF6EjWyonAzKzEzFpcvPYBcCIwMys5tfWNdBNMKtCMZG05EZiZlZhZ9Q0cMmIA/XsVp6uXE4GZWQlpaQlm1zcWrVoInAjMzErKglWb2LitqWgNxeBEYGZWUv4+I1lx2gfAicDMrKTULmpgSN8ejC9CR7JWTgRmZiWktr6BKWOGFKUjWSsnAjOzEtG4ZTuvrN5c1IZicCIwMysZs5IZySYXsX0AnAjMzErGrEUNuY5ko5wIzMwqUm19I4ftN5B+RepI1sqJwMysBDQnM5JNGVvcuwFwIjAzKwkLVm1kU5E7krUqWCKQdLukVZKe72D/hZLmSnpO0pOSJhUqFjOzUle7KNdQ3KUSAXAHcPpO9r8KnBwRbwW+DNxcwFjMzEras4saGNqvJ2P36Vv0cxdyzuLH
JY3byf4n81afBkYVKhYzs1I3q76BKWOKMyNZW6XSRnA58GDWQZiZZaFh83YWrtnM5AyqhaCAdwRpSfoncongHTspMw2YBjBmzJgiRWZmVhxvzEiWUSLI9I5A0pHArcDUiFjbUbmIuDkiqiOievjw4cUL0MysCGoXNVLVTUwaPSiT82eWCCSNAX4NfDgiXs4qDjOzrNXWN3DYfgPo2zObSpqCnVXSz4FTgGGSlgBfBHoARMQPgOuAfYDvJY0jTRFRXah4zMxKUXNLMGdxI2dPye55mUI+NXR+J/uvAK4o1PnNzMrBSys2snl7cyY9iluVylNDZmYV6e8zkmXTUAxOBGZmmaqtb2Cffj0ZM7T4HclaORGYmWVoVn0jk4s8I1lbTgRmZhlZt3k7r67ZnGn7ADgRmJllZlYJtA+AE4GZWWZq6xuo6iaOHJVNR7JWTgRmZhmpXdTI4ftn15GslROBmVkGmppbmLOkMfNqIXAiMDPLxEsrN7Jle3PpJwJJ3SR9qFjBmJlVitr67GYka2uniSAiWoD/KFIsZmYVY9aiBob178nooX2yDiVV1dAjkj4rabSkoa2vgkdmZtaF1dY3ZN6RrFWapupzk58fz9sWwIF7Pxwzs65v7aZtvLZ2C+ceUxoTbXWaCCJifDECMTOrFLPeaB/Itkdxq04TgaQewMeAk5JNfwR+GBE7ChiXmVmXVVvfQPdu4shRZZIIgO+Tm1Dme8n6h5NtnkvAzGw31NY3cPj+A+nTsyrrUIB0ieCYiJiUt/6opDmFCsjMrCtram5hzuL1fKg6uxnJ2krz1FCzpINaVyQdCDQXLiQzs67rxRUbeX1HM1PGZt9/oFWaRPBZ4DFJf5T0J+BR4DOdvUnS7ZJWSXq+g/2SdIOkOklzJU3ZtdDNzMpPqYw4mm+nVUOSqoBJwATg0GTzSxGxLcWx7wBuBO7qYP8ZyXEnAMeRa3c4LsVxzczKVm19I8P692LUkOw7krXqrGdxM3B+RGyLiLnJK00SICIeB9btpMhU4K7IeRoYLGn/1JGbmZWh2voGpowZXBIdyVqlqRr6i6QbJZ0oaUrray+ceySwOG99SbLNzKxLWrNpG4vWbimp9gFI99TQUcnP6/O2BfDOvR9O+yRNA6YBjBlTGj3xzMx21awSGmguX5o2gpqI+FYBzr0UGJ23PirZ9iYRcTNwM0B1dXUUIBYzs4L7e0eybGckaytVG0GBzl0DXJw8PXQ8sD4ilhfoXGZmmatd1MDEAwbSu0dpdCRrlaZq6C+SbgTuATa3boyI2p29SdLPgVOAYZKWAF8k10OZiPgBMB04E6gDtgAf2Y34zczKQlNzC3OXrOfcY0Z3XrjICtZGEBE7vZOIiOAfRzQ1M+uyWjuSTS6RgebypRl99J+KEYiZWVdWW4IdyVp1+viopBGSbpP0YLI+UdLlhQ/NzKzrqF3UwPABpdWRrFWafgR3AA8BByTrLwOfLlRAZmZdUW19Y8l1JGuVJhEMi4h7gRaAiGjCg86ZmaVWt2oT9eu2cPyB+2QdSrvSJILNkvYh10BM66OeBY3KzKwLqZmzDAnOfGtpjqKT5qmhfyf3zP9Bkv4CDAfOKWhUZmZdRERQM3spx4/fhxEDe2cdTrvSPDVUK+lkcqOPitzoo56m0swsheeWrue1tVu46uSDOi+ckTR3BK3tAi8UOBYzsy6nZvYyelSJM95SmtVCkK6NwMzMdkNzS/C7ucs4+ZDhDOrbI+twOuREYGZWIH97dR0rN2zjfZMO6LxwhlJVDUkaCYzNL59MPGNmZh2ombOMPj2qeM/EEVmHslOdJgJJXwPOBebx9/4DATgRmJl1YHtTCw8+v5z3TBxB356p/ubOTJro3g8cmnaKSjMzgz/XraZxyw7OKvFqIUjXRrCQZPhoMzNL57ezlzGoTw9OOmR41qF0Ks0dwRZgtqQZwBt3BRHxyYJFZWZWxl7f3szD81Yy9agD6Nm99J/JSZMIapKXmZml8Mj8
lWzZ3lzyTwu1StOz+E5JPYFDkk3uWWxmthM1c5ax74BeHDe+NAeZayvNfASnAAuAm4DvAS9LOqnAcZmZlaX1W3bwx5dW8b5JB1DVrfSGnG5Pmsqr/wVOjYiTI+Ik4DTgW2kOLul0SS9JqpN0TTv7x0h6TNIsSXMlnblr4ZuZlZbfv7CcHc1RFk8LtUqTCHpExEutKxHxMimeIpJURe4u4gxgInC+pIltiv0f4N6ImAycR+6Ow8ysbNXMWcbYffpy5KhBWYeSWppEMFPSrZJOSV63ADNTvO9YoC4iFkbEduBuYGqbMgEMTJYHAcvSBm5mVmpWbdjKU6+s5axJB5TkTGQdSfPU0MeAjwOtj4s+Qbq/3EcCi/PWlwDHtSnzJeAPkj4B9APe3d6BJE0DpgGMGTMmxanNzIrv/rnLaQmYelT5VAtBijuCiNgWEd+MiLOT17f2Yi/j84E7ImIUcCbwY0lviikibo6I6oioHj689DtnmFllqpmzjMP3H8jB+w7IOpRd0uEdgaR7I+JDkp4jmaYyX0Qc2cmxlwKj89ZHJdvyXQ6cnhzvKUm9gWHAqhSxm5mVjPq1W5i9uJHPn35Y1qHssp1VDX0q+fne3Tz2M8AESePJJYDzgAvalKkH3gXcIelwoDewejfPZ2aWmd/NzTVxvm9S6U5A05EOq4YiYnmy+K8RsSj/BfxrZwdOZjW7GngImE/u6aAXJF0v6ayk2GeAKyXNAX4OXBoRb7r7MDMrdb+dvZTqsUMYNaRv1qHssjSNxe8BPt9m2xntbHuTiJgOTG+z7bq85XnA21PEYGZWsl5csYGXV27i+qlHZB3KbtlZG8HHyP3lf5CkuXm7BgBPFjowM7NyUTN7GVXdxJlvLb9qIdj5HcHPgAeBrwL5vYI3RsS6gkZlZlYmIoKaOcs44aB9GNa/V9bh7JadtRGsj4jXgO8A6/LaB5okte0PYGZWkWrrG1nS8DpTjxqZdSi7LU3P4u8Dm/LWNyXbzMwq3u/mLKNn926cdkRpz0u8M2kSgfKf5ImIFlJOem9m1pU1Nbdw/9zlvPPQfRnQu3wnckw1VaWkT0rqkbw+RW76SjOzivbUwrWs2bSt7IaUaCtNIrgKOIFcp7DW8YKmFTIoM7NyUDN7Gf17deefDts361D2SJoZylaR6xVsZmaJbU3N/P6FFZx6xAh696jKOpw90mkikPQj2h9r6LKCRGRmVgb++NJqNm5tKqsJaDqSptH3/rzl3sAH8LwBZlbhamYvY59+PXn7wcOyDmWPpaka+lX+uqSfA38uWERmZiVu07YmHpm/kg9Vj6ZHVZqm1tK2O/+CCUB5t4yYme2Bh+etYFtTC2eV+dNCrdK0EWzkH9sIVpBiwDkzs66qZvYyDhjUm6PHDMk6lL1ip4lAuUk3j4iI+iLFY2ZW0tZt3s4TC9Zw+Ynj6datfOYl3pmdVg0lPYofKFIsZmYlb/pzy2lqiS7xtFCrNG0EtZKOKXgkZmZloGbOMg4a3o+J+w/MOpS9Jk0iOA54StIrkuZKeq7N/ARmZhVhWePrPPPaOs6aNJJczXnXkKYfwWkFj8LMrAzcP3cZEXSZp4Vapbkj+Eo7cxZ/Jc3BJZ0u6SVJdZKu6aDMhyTNk/SCpJ/tSvBmZsVUM2cZR44axPhh/bIOZa9Kkwj+YRJOSVXA0Z29KSl3E7n5jScC50ua2KbMBOBa4O0RcQTw6ZRxm5kV1cLVm3h+6YYu1UjcqsNEIOnapA/BkZI2JK+NwCrgtymOfSxQFxELI2I7cDcwtU2ZK4GbIqIB3hjgzsys5NTMWYYE7z2yghJBRHw1IgYA34iIgclrQETsExHXpjj2SGBx3vqSZFu+Q4BDJP1F0tOSTm/vQJKmSZopaebq1atTnNrMbO+JCGpmL+O48UPZb1DvrMPZ69JUDd0vqR+ApIskfVPS2L10/u7khqw4BTgfuEXS4LaFIuLmiKiOiOrhw4fvpVObmaXz6IurWLhmM+ccPTrr
UAoi7ZzFWyRNAj4DvALcleJ9S4H8qzYq2ZZvCVATETsi4lXgZXKJwcysJEQEN8xYwOihfcp+JrKOpEkETUkP46nAjRFxEzAgxfueASZIGi+pJ7nJbWralLmP3N0AkoaRqyryNJhmVjL+9PJq5ixZz8dPObhLjDTanjT/qo2SrgUuAh6Q1A3odJbmiGgCrgYeAuYD90bEC5Kul3RWUuwhYK2kecBjwOciYu3u/EPMzPa2iOA7MxYwcnAfzp4yKutwCiZNh7JzgQuAyyNihaQxwDfSHDwipgPT22y7Lm85gH9PXmZmJeXPdWuYVd/If37gLfTs3jXvBiDdxDQrgG/mrdeTro3AzKxsRQTfeWQBBwzqzTlHd927AUhRNSTpbEkLJK1v7UsgaUMxgjMzy8pTr6xl5qIGPnbKQfTqXt6T03cmTdXQ14H3RcT8QgdjZlYqvj1jASMG9uKD1V3zkdF8aSq9VjoJmFkleXrhWv726jo+dvJB9O7Rte8GIN0dwUxJ95B71HNb68aI+HXBojIzy9ANMxYwfEAvzjt2TNahFEWaRDAQ2AKcmrctACcCM+tynnltHU++spb/+96JFXE3AOmeGvpIMQIxMysFN8xYwLD+PbmgQu4GIN1TQ6Mk/UbSquT1K0ld+1kqM6tIzy5q4IkFa5h20oH06VkZdwOQrrH4R+SGhjggef0u2WZm1qXcMGMBQ/v15KLj99a4muUhTSIYHhE/ioim5HUH4CFAzaxLmb24kT+9vJorTzyQvj3TNJ92HWkSwdpk+Omq5HUR4PGAzKxLuWHGAob07cHFb6usuwFIlwguAz4ErACWA+cAbkA2sy7juSXrefTFVVxx4oH061VZdwOQ7qmhRcBZnZUzMytX35mxgEF9KvNuANI9NXRn/qxhkoZIur2wYZmZFcfzS9fzyPyVXP6O8Qzo3ekI+11SmqqhIyOisXUlmWh+cuFCMjMrnu8+uoABvbtzyQnjsg4lM2kSQTdJQ1pXJA0lXY9kM7OSNn/5Bh56YSWXvX08g/pU5t0ApPtC/1/gKUm/SNY/CPxn4UIyMyuOGx+to3+v7lz29vFZh5KpNI3Fd0maCbwz2XR2RMwrbFhmZoX18sqNTH9+OR8/5WAG9a3cuwFIVzUEMBTYHBE3AqslpUqfkk6X9JKkOknX7KTcv0gKSdUp4zEz2yPffbSOvj2quPwdlX03AOmeGvoi8Hng2mRTD+AnKd5XBdwEnAFMBM6XNLGdcgOATwF/TR+2mdnuq1u1kfvnLuPiE8YxpF/PrMPJXJo7gg+Q60ewGSAilgEDUrzvWKAuIhZGxHbgbmBqO+W+DHwN2JoqYjOzPXTjo3X06VHFlScemHUoJSFNItgeEUFuDgIk9Ut57JHA4rz1Jcm2N0iaAoyOiAd2diBJ0yTNlDRz9erVKU9vZvZmC1dvombOMj58/FiG+m4ASJcI7pX0Q2CwpCuBR4Bb9vTEkroB3wQ+01nZiLg5Iqojonr4cI93Z2a778bH6ujZvRtX+G7gDTt9akiSgHuAw4ANwKHAdRHxcIpjLwXyZ30elWxrNQB4C/DH3GnYD6iRdFZEzEz9LzAzS+m1NZv57exlfOSEcQwf0CvrcErGThNBRISk6RHxViDNl3++Z4AJyRNGS4HzgAvyjr0eGNa6LumPwGedBMysUG56rI7u3cS0k303kC9N1VCtpGN29cAR0QRcDTwEzAfujYgXJF0vyYPYmVlRLV63hV/PWsoFx41h3wG9sw6npKTpWXwccKGkReSeHBK5m4UjO3tjREwHprfZdl0HZU9JEYuZ2W656bE6qrqJq04+KOtQSk6aRHBawaMwMyugJ19Zwy+eXcKFx41hxEDfDbSVdj4CM7OytHD1Jj72k1oOGt6Pz552aNbhlKS0Q0yYmZWdxi3bufzOmXTvJm675BgGVuh8A53xcNJm1iVtb2rhYz+pZWnD6/zsyuMYPbRv1iGVLCcCM+tyIoL/e9/zPLVwLd86dxLV
44ZmHVJJc9WQmXU5tz7xKvfMXMwn3nkwH5g8KutwSp4TgZl1KQ/PW8l/PTiff37r/vzbuw/JOpyy4ERgZl3GC8vW86m7Z3HkyEH8zwcn0a2bsg6pLDgRmFmXsGrDVq68cyaD+vTglour6dOzKuuQyoYbi82s7G3d0cyVd82k8fUd/OKqt7GvO43tEicCMytrLS3BZ+6dw9yl6/nhRUdzxAGDsg6p7LhqyMzK2rcfeZkHnlvOtWccxqlH7Jd1OGXJicDMytZ9s5Zyw6N1nFs92tNO7gEnAjMrSzNfW8d//HIuxx84lC+//y0kE1zZbnAiMLOys3jdFj7642c5YHBvfnDR0fTs7q+yPeGrZ2ZlZcPWHVx2xzPsaG7htkuPYXBfT0C/p/zUkJmVjabmFj7xs1m8umYzd152LAcN7591SF2CE4GZlY2vPDCfP728mq+e/VbefvCwzt9gqRS0akjS6ZJeklQn6Zp29v+7pHmS5kqaIWlsIeMxs/L146de444nX+OKd4zn/GPHZB1Ol1KwRCCpCrgJOAOYCJwvaWKbYrOA6mT+418CXy9UPGZWvh58bjlfrHmBdx22L9eeeXjW4XQ5hbwjOBaoi4iFEbEduBuYml8gIh6LiC3J6tOAx4s1s3/w2Iur+OTds5g8Zgg3nD+ZKg8kt9cVMhGMBBbnrS9JtnXkcuDB9nZImiZppqSZq1ev3oshmlkpe/KVNVz1k2c5dL8B3H7pMfTr5WbNQiiJx0clXQRUA99ob39E3BwR1RFRPXz48OIGZ2aZeHZRA1fcOZMxQ/ty12XHMaiP5xsulEKm16XA6Lz1Ucm2fyDp3cAXgJMjYlsB4zGzMvH80vVc+qO/se+AXvz0iuMY2s99BQqpkHcEzwATJI2X1BM4D6jJLyBpMvBD4KyIWFXAWMysTCxYuZGLb/8bA3v34KdXHu8hpYugYIkgIpqAq4GHgPnAvRHxgqTrJZ2VFPsG0B/4haTZkmo6OJyZVYBFazdz4a1/paqb+MkVxzFycJ+sQ6oIBW15iYjpwPQ2267LW353Ic9vZuVjWePrXHDLX9nR3MLd097G+GH9sg6pYpREY7GZVbZVG7dy4a1/ZcPrO7jrsuM4dL8BWYdUUfwslpllqmHzdj58699YsX4rP7niWN46yjOMFZsTgZllZsPWHVx8+994de1mfnTpMRw9dmjWIVUkVw2ZWSa2bG/i8jueYf7yDXz/wikeRC5DTgRmVnRbdzTz0R8/y7OLGvj2eUfxrsNHZB1SRXPVkJkV1Y7mFq7+WS1PLFjD/3xwEu898oCsQ6p4viMws6Jpbgn+7Z7ZPDJ/FV+eegTnHO1xJkuBE4GZFUVLS3DNr+Zy/9zlXHvGYXz4beOyDskSrhoys4Lb0dzC9b+bxy+eXcIn3zWBj558UNYhWR4nAjMrqL/UreGLNS9Qt2oTV544nn9794SsQ7I2nAjMrCCWNr7Ofz0wnweeW86YoX257ZJqPx1UopwIzGyv2tbUzK1PvMqNj9YRBJ95zyFcedKB9O5RlXVo1gEnAjPbax59cSXX/24er63dwhlv2Y8v/PPhjBrSN+uwrBNOBGa2xxat3cz1v5vHjBdXcdDwfvz48mM5cYJnEywXTgRmttte397M9/5Yxw8fX0iPbuILZx7OJSeMo2d3P5leTpwIzGyXRQS/f34FX3lgPksbX+cDk0dyzRmHMcKziZUlJwIz2yV1qzbypZp5/LluDYftN4B7P/o2jh3vUUPLmROBmaWyaVsTN8xYwO1/fpW+Pau4fuoRXHDsGLpXuRqo3BU0EUg6HfgOUAXcGhH/3WZ/L+Au4GhgLXBuRLxWyJjMrH2btjWxYv3W3GvDVlasf53l67eycsNWlq/fSv3aLWza3sS51aP53GmHsk//XlmHbHtJwRKBpCrgJuA9wBLgGUk1ETEvr9jlQENEHCzpPOBrwLmFismsK2tuCXY0t7C9uYWm5mS5qYUdzS00tQTbdrSwetNWVqzf
9saXfO4LP/fauK3pTccc2q8nIwb2Zv9BvTlq9GA+WD2ao0YPzuBfZ4VUyDuCY4G6iFgIIOluYCqQnwimAl9Kln8J3ChJERF7O5g/vbyar9w/r/OCZiUqgKbmFnYkX/I72iy37MJvTTfBvgN6M2JQbw4a3p+3HzyM/QblvvD3G9ib/Qb1ZsTA3u4EViEKmQhGAovz1pcAx3VUJiKaJK0H9gHW5BeSNA2YBjBmzJjdCqZ/r+5MGNF/t95rViq6d+tGj6pu9OwuelTllrtXiZ7Jcu6lf1ju2b1b8j4xfEAv9hvUm+H9e7lu395QFo3FEXEzcDNAdXX1bt0tHD12CEePPXqvxmVm1hUU8k+CpcDovPVRybZ2y0jqDgwi12hsZmZFUshE8AwwQdJ4ST2B84CaNmVqgEuS5XOARwvRPmBmZh0rWNVQUud/NfAQucdHb4+IFyRdD8yMiBrgNuDHkuqAdeSShZmZFVFB2wgiYjowvc226/KWtwIfLGQMZma2c35swMyswjkRmJlVOCcCM7MK50RgZlbhVG5Pa0paDSzazbcPo02v5RJT6vFB6cfo+PaM49szpRzf2Ihod9q4sksEe0LSzIiozjqOjpR6fFD6MTq+PeP49kypx9cRVw2ZmVU4JwIzswpXaYng5qwD6ESpxwelH6Pj2zOOb8+Uenztqqg2AjMze7NKuyMwM7M2nAjMzCpcl0wEkk6X9JKkOknXtLO/l6R7kv1/lTSuiLGNlvSYpHmSXpD0qXbKnCJpvaTZyeu69o5VwBhfk/Rccu6Z7eyXpBuS6zdX0pQixnZo3nWZLWmDpE+3KVP06yfpdkmrJD2ft22opIclLUh+DungvZckZRZIuqS9MgWK7xuSXkz+D38jqd3JiDv7PBQwvi9JWpr3/3hmB+/d6e97AeO7Jy+21yTN7uC9Bb9+eywiutSL3JDXrwAHAj2BOcDENmX+FfhBsnwecE8R49sfmJIsDwBebie+U4D7M7yGrwHDdrL/TOBBQMDxwF8z/L9eQa6jTKbXDzgJmAI8n7ft68A1yfI1wNfaed9QYGHyc0iyPKRI8Z0KdE+Wv9ZefGk+DwWM70vAZ1N8Bnb6+16o+Nrs/1/guqyu356+uuIdwbFAXUQsjIjtwJ9Mfm0AAAW9SURBVN3A1DZlpgJ3Jsu/BN4lScUILiKWR0RtsrwRmE9u7uZyMhW4K3KeBgZL2j+DON4FvBIRu9vTfK+JiMfJzamRL/9zdifw/nbeehrwcESsi4gG4GHg9GLEFxF/iIimZPVpcrMIZqKD65dGmt/3Pbaz+JLvjg8BP9/b5y2WrpgIRgKL89aX8OYv2jfKJL8I64F9ihJdnqRKajLw13Z2v03SHEkPSjqiqIFBAH+Q9Kykae3sT3ONi+E8Ov7ly/L6tRoREcuT5RXAiHbKlMq1vIzcXV57Ovs8FNLVSdXV7R1UrZXC9TsRWBkRCzrYn+X1S6UrJoKyIKk/8Cvg0xGxoc3uWnLVHZOA7wL3FTm8d0TEFOAM4OOSTiry+TuVTH96FvCLdnZnff3eJHJ1BCX5rLakLwBNwE87KJLV5+H7wEHAUcByctUvpeh8dn43UPK/T10xESwFRuetj0q2tVtGUndgELC2KNHlztmDXBL4aUT8uu3+iNgQEZuS5elAD0nDihVfRCxNfq4CfkPu9jtfmmtcaGcAtRGxsu2OrK9fnpWtVWbJz1XtlMn0Wkq6FHgvcGGSrN4kxeehICJiZUQ0R0QLcEsH5836+nUHzgbu6ahMVtdvV3TFRPAMMEHS+OSvxvOAmjZlaoDWpzPOAR7t6Jdgb0vqE28D5kfENzsos19rm4WkY8n9PxUlUUnqJ2lA6zK5BsXn2xSrAS5Onh46HlifVwVSLB3+FZbl9Wsj/3N2CfDbdso8BJwqaUhS9XFqsq3gJJ0O/AdwVkRs6aBMms9DoeLLb3f6QAfnTfP7XkjvBl6MiCXt7czy+u2SrFurC/Ei91TLy+SeJvhCsu16
ch94gN7kqhTqgL8BBxYxtneQqyKYC8xOXmcCVwFXJWWuBl4g9wTE08AJRYzvwOS8c5IYWq9ffnwCbkqu73NAdZH/f/uR+2IflLct0+tHLiktB3aQq6e+nFy70wxgAfAIMDQpWw3cmvfey5LPYh3wkSLGV0eufr31c9j6JN0BwPSdfR6KFN+Pk8/XXHJf7vu3jS9Zf9PvezHiS7bf0fq5yytb9Ou3py8PMWFmVuG6YtWQmZntAicCM7MK50RgZlbhnAjMzCqcE4GZWYXrnnUAZsUk6UvAJmAg8HhEPFLk859FblC0/y7mec12xo+PWkVpTQQR8T9Zx2JWKlw1ZF2epC9IelnSn4FDk213SDonWX5N0ldbx4uXNEXSQ5JekXRV3nE+J+mZZBC0/5dsGydpvqRblJtf4g+S+iT7PqncvBNzJd2dbLtU0o1573002T9D0pi82G6Q9KSkhXlx7i/p8STO5yWdWMTLaF2YE4F1aZKOJjfswFHkeqAe00HR+og4CniCXG/Rc8jNtdD6hX8qMIHcODFHAUfnDR42AbgpIo4AGoF/SbZfA0yOiCPJ9Xxu67vAncn+nwI35O3bn1wv9PcCrdVIFwAPJXFOItcb2GyPuY3AuroTgd9EMpaOpI7GoWnd/hzQP3JzRWyUtE25mbtOTV6zknL9ySWAeuDViGj9Un4WGJcszwV+Kuk+2h8B9W3kBiyD3HAKX8/bd1/kBlubJ6l1+OpngNuTQQvvyzun2R7xHYFZzrbkZ0vecut6d3LjK301Io5KXgdHxG1t3gvQzN//wPpncmMyTQGeSUaq3NV4SM5N5CZHOYnc6Jp3SLp4F45n1iEnAuvqHgfeL6lPMgrk+3bzOA8BlyXzSCBppKR9OyosqRswOiIeAz5Pbqjz/m2KPUmu2grgQnLVUh2SNJbcBCi3ALeSSzBme8xVQ9alRUStpHvIjf64ilz1yu4c5w+SDgeeSka43gRcRO4OoD1VwE8kDSL3F/0NEdGof5wR9RPAjyR9DlgNfKSTME4BPidpR3J+3xHYXuHHR83MKpyrhszMKpwTgZlZhXMiMDOrcE4EZmYVzonAzKzCORGYmVU4JwIzswr3/wH5lAAgq/EyTwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.plot(list(range(dimensions_range[1])), reconstruction_errors)\n",
    "plt.ylabel('reconstruction error')\n",
    "plt.xlabel('dimensions')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 68
    },
    "colab_type": "code",
    "id": "LdyBjzklKFLX",
    "outputId": "6c954151-f159-45bd-8295-5f8752ffcf49"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.00000000e+00, 5.97277824e-15, 4.53648713e-15, 2.67953071e-09,\n",
       "       1.55691443e-08, 8.63104004e-08, 2.11263688e-07, 3.52622045e-07,\n",
       "       5.93974854e-07, 9.35832024e-07, 1.47094691e-06])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reconstruction_errors[dimensions_range[0]-1:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "snLLb37Rv5EF",
    "outputId": "a0031ea8-1345-4086-8b64-b7c27467a6d5"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "chosen dimensions: 12\n"
     ]
    }
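   ],
   "source": [
    "# reconstruction_errors is indexed by n_components, so the argmin over the slice\n",
    "# is offset by dimensions_range[0]; the trailing +1 picks one dimension more than that.\n",
    "n_components = np.argmin(reconstruction_errors[dimensions_range[0]:]) + dimensions_range[0]+1\n",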
    "print('chosen dimensions: {}'.format(n_components))\n",
    "lle = LocallyLinearEmbedding(\n",
    "    n_components=n_components, eigen_solver='auto', n_neighbors=2\n",
    ").fit(transactions)\n",
    "transactions_reduced = lle.transform(transactions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "QH9gr1XTxBcU"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/ben/anaconda3/lib/python3.6/site-packages/sklearn/cluster/_affinity_propagation.py:152: FutureWarning: 'random_state' has been introduced in 0.23. It will be set to None starting from 0.25 which means that results will differ at every function call. Set 'random_state' to None to silence this warning, or to 0 to keep the behavior of versions <0.23.\n",
      "  FutureWarning)\n"
     ]
    }
   ],
   "source": [
    "from sklearn.cluster import AffinityPropagation\n",
    "# Fix random_state so the clustering is reproducible across runs.\n",
    "clustering = AffinityPropagation(damping=0.9, random_state=0).fit(transactions_reduced)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "KbB0hCjVyMge",
    "outputId": "e6d70b45-ebd0-43fc-89b9-aeb3691ab19e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.unique(clustering.predict(transactions_reduced))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Gih71BlXwTfv"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.decomposition import FastICA\n",
    "from sklearn.svm import SVC\n",
    "from sklearn.base import TransformerMixin\n",
    "\n",
    "class Reduce(TransformerMixin):\n",
    "  '''Trivial reduction that keeps only the first two columns.'''\n",
    "  def fit(self, X, y=None):\n",
    "    return self\n",
    "  def transform(self, X):\n",
    "    return X[:, :2]\n",
    "\n",
    "\n",
    "def plot_decision_boundary(data, kmeans, title='No title', h=.001, model=None, highlight_centroids=False, ica=Reduce()):\n",
    "    '''Based on https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_digits.html\n",
    "  \n",
    "    Parameters\n",
    "    ----------\n",
    "    data - the dataset to be visualized with clusters\n",
    "    kmeans - the fitted clustering algorithm (needs a predict method)\n",
    "    title - the title to be displayed with the plot\n",
    "    h - Step size of the mesh. Decrease to increase the quality of the VQ.\n",
    "    model - model to re-learn projections in lower-dimensional space. Skipped if None.\n",
    "    highlight_centroids - whether to show the centroids (False)\n",
    "      centroids might have little bearing in a different space.\n",
    "    ica - a dimensionality reduction method with fit and transform. This has to result in\n",
    "      two dimensions, e.g. FastICA(n_components=2)\n",
    "    '''\n",
    "    #ica = ica.fit(data)\n",
    "    reduced_data = ica.fit_transform(data)\n",
    "    if model is not None:\n",
    "      svc = model.fit(reduced_data, kmeans.predict(data))\n",
    "    # Plot the decision boundary. For that, we will assign a color to each\n",
    "    # point in the mesh [x_min, x_max] x [y_min, y_max].\n",
    "    x_min, x_max = reduced_data[:, 0].min() - 1e-15, reduced_data[:, 0].max() + 1e-15\n",
    "    y_min, y_max = reduced_data[:, 1].min() - 1e-15, reduced_data[:, 1].max() + 1e-15\n",
    "    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n",
    "\n",
    "    # Obtain labels for each point in mesh. Use last trained model.\n",
    "    if model is not None:\n",
    "      preds = svc.predict(\n",
    "              np.c_[xx.ravel(), yy.ravel()]\n",
    "      )\n",
    "    elif data.shape[1] == 2:\n",
    "      preds = kmeans.predict(\n",
    "              np.c_[xx.ravel(), yy.ravel()]\n",
    "      )\n",
    "    else:\n",
    "      preds = kmeans.predict(\n",
    "          ica.inverse_transform(\n",
    "              np.c_[xx.ravel(), yy.ravel()]\n",
    "          )\n",
    "      )\n",
    "    pred_dict = {predval: i for i, predval in enumerate(np.unique(preds))}\n",
    "    Z = preds.reshape(xx.shape)\n",
    "    plt.figure(1)\n",
    "    plt.clf()\n",
    "    plt.imshow(\n",
    "        Z, interpolation=None,\n",
    "        extent=(xx.min(), xx.max(), yy.min(), yy.max()),\n",
    "        cmap=plt.cm.Paired,\n",
    "        aspect='auto', origin='lower',\n",
    "    )\n",
    "\n",
    "    plt.plot(reduced_data[:, 0], reduced_data[:, 1], 'k.', markersize=2)\n",
    "    plt.colorbar()\n",
    "    if highlight_centroids:\n",
    "      # Plot the centroids as a white X\n",
    "      centroids = ica.transform(kmeans.cluster_centers_)\n",
    "      centroids = np.array(\n",
    "          [centroid for i, centroid in enumerate(centroids) if i in pred_dict]\n",
    "      )\n",
    "      plt.scatter(\n",
    "          centroids[:, 0], centroids[:, 1],\n",
    "          marker='x', s=169, linewidths=3,\n",
    "          color='w', zorder=10\n",
    "      )\n",
    "    plt.title(title)\n",
    "    plt.xlim(x_min, x_max)\n",
    "    plt.ylim(y_min, y_max)\n",
    "    plt.xticks(())\n",
    "    plt.yticks(())\n",
    "    plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "E7yLql3fizSL"
   },
   "outputs": [],
   "source": [
    "y = clustering.predict(lle.transform(transactions))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 503
    },
    "colab_type": "code",
    "id": "fzvwP6GRJawX",
    "outputId": "2efca8f1-6228-484e-fe73-3c276b760657"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>creditamount</th>\n",
       "      <th>duration</th>\n",
       "      <th>installmentrate</th>\n",
       "      <th>residencesince</th>\n",
       "      <th>age</th>\n",
       "      <th>existingcredits</th>\n",
       "      <th>peopleliable</th>\n",
       "      <th>classification</th>\n",
       "      <th>existingchecking_A11</th>\n",
       "      <th>existingchecking_A12</th>\n",
       "      <th>...</th>\n",
       "      <th>housing_A152</th>\n",
       "      <th>housing_A153</th>\n",
       "      <th>job_A171</th>\n",
       "      <th>job_A172</th>\n",
       "      <th>job_A173</th>\n",
       "      <th>job_A174</th>\n",
       "      <th>telephone_A191</th>\n",
       "      <th>telephone_A192</th>\n",
       "      <th>foreignworker_A201</th>\n",
       "      <th>foreignworker_A202</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>y</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>3271.258</td>\n",
       "      <td>20.903</td>\n",
       "      <td>2.973</td>\n",
       "      <td>2.845</td>\n",
       "      <td>35.546</td>\n",
       "      <td>1.407</td>\n",
       "      <td>1.155</td>\n",
       "      <td>1.3</td>\n",
       "      <td>0.274</td>\n",
       "      <td>0.269</td>\n",
       "      <td>...</td>\n",
       "      <td>0.713</td>\n",
       "      <td>0.108</td>\n",
       "      <td>0.022</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.63</td>\n",
       "      <td>0.148</td>\n",
       "      <td>0.596</td>\n",
       "      <td>0.404</td>\n",
       "      <td>0.963</td>\n",
       "      <td>0.037</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 62 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   creditamount  duration  installmentrate  residencesince     age  \\\n",
       "y                                                                    \n",
       "0      3271.258    20.903            2.973           2.845  35.546   \n",
       "\n",
       "   existingcredits  peopleliable  classification  existingchecking_A11  \\\n",
       "y                                                                        \n",
       "0            1.407         1.155             1.3                 0.274   \n",
       "\n",
       "   existingchecking_A12  ...  housing_A152  housing_A153  job_A171  job_A172  \\\n",
       "y                        ...                                                   \n",
       "0                 0.269  ...         0.713         0.108     0.022       0.2   \n",
       "\n",
       "   job_A173  job_A174  telephone_A191  telephone_A192  foreignworker_A201  \\\n",
       "y                                                                           \n",
       "0      0.63     0.148           0.596           0.404               0.963   \n",
       "\n",
       "   foreignworker_A202  \n",
       "y                      \n",
       "0               0.037  \n",
       "\n",
       "[1 rows x 62 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "transactions.join(pd.DataFrame(data=y, columns=['y'])).groupby(by='y').mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "2j8ReVhwBfG3"
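   },
   "source": [
    "Arguably, what matters most to us is whether customers can pay back their loans. When we spend money on marketing, we want to spend it on the potential customers we can make money with or, conversely, avoid those we would lose money on. We therefore use the `classification` column as a target, fit a random forest to it, and cluster customers by the decision paths they take through the trees."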
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "0KXlsFV5A7z3"
   },
   "outputs": [],
   "source": [
    "X = transactions.drop(columns=['classification'])\n",
    "y = transactions['classification']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "PD8vQJAqCwt_"
   },
   "outputs": [],
   "source": [
    "from sklearn.ensemble import RandomForestClassifier\n",
    "\n",
    "rf = RandomForestClassifier().fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "okIvJse_C4Fd"
   },
   "outputs": [],
   "source": [
    "# DP is a sparse node-indicator matrix of shape (n_samples, total number of nodes in the forest).\n",
    "# Columns DP[:, n_nodes_ptr[i]:n_nodes_ptr[i+1]] hold the indicator values for the i-th estimator.\n",
    "DP, n_nodes_ptr = rf.decision_path(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 221
    },
    "colab_type": "code",
    "id": "IGCIC2B5KGaW",
    "outputId": "da33bd5f-3adc-4098-96ca-ad5a3fc37997"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([    0,   481,   912,  1403,  1842,  2301,  2774,  3249,  3734,\n",
       "        4203,  4738,  5251,  5726,  6197,  6662,  7127,  7588,  8115,\n",
       "        8576,  9051,  9482,  9957, 10404, 10915, 11338, 11799, 12282,\n",
       "       12747, 13202, 13687, 14160, 14623, 15044, 15493, 15904, 16385,\n",
       "       16886, 17381, 17824, 18235, 18694, 19181, 19656, 20129, 20604,\n",
       "       21075, 21540, 21959, 22398, 22861, 23350, 23791, 24240, 24729,\n",
       "       25252, 25723, 26198, 26627, 27092, 27557, 28024, 28495, 28986,\n",
       "       29489, 29974, 30413, 30834, 31267, 31758, 32243, 32694, 33179,\n",
       "       33640, 34113, 34604, 35085, 35572, 36057, 36528, 37031, 37496,\n",
       "       37977, 38466, 38953, 39428, 39859, 40340, 40787, 41242, 41699,\n",
       "       42138, 42651, 43126, 43659, 44142, 44645, 45134, 45591, 46024,\n",
       "       46483, 46892])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n_nodes_ptr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "ZpbDD_6KKzj8",
    "outputId": "74148ce4-9236-4aa3-f776-3682055cfe82"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1000, 46892)"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DP.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Zi1QTmMJHycf"
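   },
   "outputs": [],
   "source": [
    "def reduce_decision_path(indicator, n_nodes_ptr, level=2):\n",
    "  # For every tree, pick the indicator columns at node ids\n",
    "  # [2**(level-1), 2**level) relative to the tree's first column (node 0\n",
    "  # only for level 0) and stack them side by side.\n",
    "  # Caveat: scikit-learn's default tree builder numbers nodes depth-first, so\n",
    "  # these ids only roughly correspond to a depth level; treat this as a heuristic.\n",
    "  def level_offsets(level):\n",
    "    if level > 0:\n",
    "      return np.array([2**(level-1), 2**level], dtype=np.int64)\n",
    "    return np.array([0, 1], dtype=np.int64)\n",
    "\n",
    "  X = indicator.toarray()  # dense ndarray instead of the legacy np.matrix\n",
    "  offsets = level_offsets(level)\n",
    "  nodes = []\n",
    "  # The last entry of n_nodes_ptr marks the end of the last tree, so skip it.\n",
    "  for est_ptr in n_nodes_ptr[:-1]:\n",
    "    start, stop = offsets + est_ptr\n",
    "    if stop <= X.shape[1]:\n",
    "      nodes.append(X[:, start:stop])\n",
    "  return np.column_stack(nodes)"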
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "_516aQK_KtzC"
   },
   "outputs": [],
   "source": [
    "matrix = StandardScaler().fit_transform(np.column_stack([\n",
    "  reduce_decision_path(DP, n_nodes_ptr, level=1).mean(axis=-1),\n",
    "  reduce_decision_path(DP, n_nodes_ptr, level=2).mean(axis=-1),\n",
    "]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "GZXFqHZ_DM1I"
   },
   "outputs": [
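   ],
   "source": [
    "from sklearn.cluster import AffinityPropagation\n",
    "from sklearn.cluster import AgglomerativeClustering\n",
    "from scipy.spatial.distance import pdist, squareform\n",
    "\n",
    "# Pairwise Euclidean distances between customers in the reduced space.\n",
    "distances = squareform(pdist(matrix))\n",
    "# With its default affinity, AffinityPropagation treats each row of `distances`\n",
    "# as a feature vector. random_state=0 keeps the pre-0.23 behavior (needs scikit-learn >= 0.23).\n",
    "clustering = AffinityPropagation(damping=0.9, random_state=0).fit(distances)\n",
    "# Note: scikit-learn >= 1.2 renames the `affinity` parameter to `metric`.\n",
    "clustering2 = AgglomerativeClustering(\n",
    "    n_clusters=5, affinity='precomputed', linkage='average'\n",
    ").fit(distances)"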
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "ESdKCq9tRgB5",
    "outputId": "018c886c-50e3-4e28-a9f0-8d7906aee77a"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "14"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.max(clustering.predict(distances))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 545
    },
    "colab_type": "code",
    "id": "N0R4zu2cnghI",
    "outputId": "6251b318-2b00-4518-8379-c9d3912dd3a1"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age_mean</th>\n",
       "      <th>age_std</th>\n",
       "      <th>creditamount</th>\n",
       "      <th>duration</th>\n",
       "      <th>count</th>\n",
       "      <th>class_mean</th>\n",
       "      <th>class_std</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>cluster</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>34.864865</td>\n",
       "      <td>11.394933</td>\n",
       "      <td>2518.837838</td>\n",
       "      <td>17.459459</td>\n",
       "      <td>74</td>\n",
       "      <td>1.202703</td>\n",
       "      <td>0.404757</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34.029851</td>\n",
       "      <td>11.262663</td>\n",
       "      <td>2578.492537</td>\n",
       "      <td>17.835821</td>\n",
       "      <td>67</td>\n",
       "      <td>1.208955</td>\n",
       "      <td>0.409631</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>39.740741</td>\n",
       "      <td>13.357777</td>\n",
       "      <td>6341.259259</td>\n",
       "      <td>34.370370</td>\n",
       "      <td>27</td>\n",
       "      <td>1.222222</td>\n",
       "      <td>0.423659</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>39.445946</td>\n",
       "      <td>11.125858</td>\n",
       "      <td>4091.094595</td>\n",
       "      <td>26.972973</td>\n",
       "      <td>74</td>\n",
       "      <td>1.229730</td>\n",
       "      <td>0.423530</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>32.523810</td>\n",
       "      <td>10.704998</td>\n",
       "      <td>2472.476190</td>\n",
       "      <td>16.738095</td>\n",
       "      <td>42</td>\n",
       "      <td>1.238095</td>\n",
       "      <td>0.431081</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>37.851064</td>\n",
       "      <td>11.313668</td>\n",
       "      <td>5182.255319</td>\n",
       "      <td>27.170213</td>\n",
       "      <td>47</td>\n",
       "      <td>1.255319</td>\n",
       "      <td>0.440755</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>33.000000</td>\n",
       "      <td>11.904926</td>\n",
       "      <td>1741.434783</td>\n",
       "      <td>13.217391</td>\n",
       "      <td>23</td>\n",
       "      <td>1.260870</td>\n",
       "      <td>0.448978</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>39.000000</td>\n",
       "      <td>11.317636</td>\n",
       "      <td>4075.659341</td>\n",
       "      <td>23.307692</td>\n",
       "      <td>91</td>\n",
       "      <td>1.263736</td>\n",
       "      <td>0.443099</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>33.183673</td>\n",
       "      <td>10.821272</td>\n",
       "      <td>3266.959184</td>\n",
       "      <td>20.663265</td>\n",
       "      <td>98</td>\n",
       "      <td>1.285714</td>\n",
       "      <td>0.454077</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>32.611111</td>\n",
       "      <td>10.165144</td>\n",
       "      <td>1695.000000</td>\n",
       "      <td>12.194444</td>\n",
       "      <td>36</td>\n",
       "      <td>1.305556</td>\n",
       "      <td>0.467177</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>32.092105</td>\n",
       "      <td>11.413650</td>\n",
       "      <td>2423.460526</td>\n",
       "      <td>16.276316</td>\n",
       "      <td>76</td>\n",
       "      <td>1.315789</td>\n",
       "      <td>0.467918</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>36.292929</td>\n",
       "      <td>10.589996</td>\n",
       "      <td>3404.424242</td>\n",
       "      <td>21.181818</td>\n",
       "      <td>99</td>\n",
       "      <td>1.353535</td>\n",
       "      <td>0.480500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>31.000000</td>\n",
       "      <td>12.363541</td>\n",
       "      <td>1932.250000</td>\n",
       "      <td>12.750000</td>\n",
       "      <td>8</td>\n",
       "      <td>1.375000</td>\n",
       "      <td>0.517549</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>34.620690</td>\n",
       "      <td>11.243282</td>\n",
       "      <td>2827.034483</td>\n",
       "      <td>20.232759</td>\n",
       "      <td>116</td>\n",
       "      <td>1.379310</td>\n",
       "      <td>0.487320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>37.040984</td>\n",
       "      <td>11.192085</td>\n",
       "      <td>3557.418033</td>\n",
       "      <td>23.278689</td>\n",
       "      <td>122</td>\n",
       "      <td>1.418033</td>\n",
       "      <td>0.495270</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          age_mean    age_std  creditamount   duration  count  class_mean  \\\n",
       "cluster                                                                     \n",
       "5        34.864865  11.394933   2518.837838  17.459459     74    1.202703   \n",
       "0        34.029851  11.262663   2578.492537  17.835821     67    1.208955   \n",
       "11       39.740741  13.357777   6341.259259  34.370370     27    1.222222   \n",
       "12       39.445946  11.125858   4091.094595  26.972973     74    1.229730   \n",
       "2        32.523810  10.704998   2472.476190  16.738095     42    1.238095   \n",
       "3        37.851064  11.313668   5182.255319  27.170213     47    1.255319   \n",
       "4        33.000000  11.904926   1741.434783  13.217391     23    1.260870   \n",
       "7        39.000000  11.317636   4075.659341  23.307692     91    1.263736   \n",
       "1        33.183673  10.821272   3266.959184  20.663265     98    1.285714   \n",
       "13       32.611111  10.165144   1695.000000  12.194444     36    1.305556   \n",
       "6        32.092105  11.413650   2423.460526  16.276316     76    1.315789   \n",
       "10       36.292929  10.589996   3404.424242  21.181818     99    1.353535   \n",
       "14       31.000000  12.363541   1932.250000  12.750000      8    1.375000   \n",
       "8        34.620690  11.243282   2827.034483  20.232759    116    1.379310   \n",
       "9        37.040984  11.192085   3557.418033  23.278689    122    1.418033   \n",
       "\n",
       "         class_std  \n",
       "cluster             \n",
       "5         0.404757  \n",
       "0         0.409631  \n",
       "11        0.423659  \n",
       "12        0.423530  \n",
       "2         0.431081  \n",
       "3         0.440755  \n",
       "4         0.448978  \n",
       "7         0.443099  \n",
       "1         0.454077  \n",
       "13        0.467177  \n",
       "6         0.467918  \n",
       "10        0.480500  \n",
       "14        0.517549  \n",
       "8         0.487320  \n",
       "9         0.495270  "
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = clustering.predict(distances)\n",
    "clusters = transactions.join(\n",
    "    pd.DataFrame(data=y, columns=['cluster'])\n",
    ").groupby(by='cluster').agg(\n",
    "    age_mean=pd.NamedAgg(column='age', aggfunc='mean'),\n",
    "    age_std=pd.NamedAgg(column='age', aggfunc='std'),\n",
    "    creditamount=pd.NamedAgg(column='creditamount', aggfunc='mean'),\n",
    "    duration=pd.NamedAgg(column='duration', aggfunc='mean'),\n",
    "    count=pd.NamedAgg(column='age', aggfunc='count'),\n",
    "    class_mean=pd.NamedAgg(column='classification', aggfunc='mean'),\n",
    "    class_std=pd.NamedAgg(column='classification', aggfunc='std'),\n",
    ").sort_values(by='class_mean')\n",
    "clusters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 343
    },
    "colab_type": "code",
    "id": "xltIqbaeFner",
    "outputId": "008a3d25-0b00-4f7a-d571-18aafdc67f75"
   },
   "outputs": [
   ],
   "source": [
    "%pip install dython"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 300
    },
    "colab_type": "code",
    "id": "TH3hCLb5Foj_",
    "outputId": "2f99727b-c922-491e-a329-191f29ca9fb6"
   },
   "outputs": [],
   "source": [
    "from dython.nominal import correlation_ratio"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 300
    },
    "colab_type": "code",
    "id": "y5gyf7b6z3tr",
    "outputId": "e15f1d35-ba25-4d4d-96ec-e1d4954ffab3"
   },
   "outputs": [],
   "source": [
    "from dython.nominal import associations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "qnCPhhmRF_gR",
    "outputId": "c20c654d-0716-4a15-fa61-e7f1b93a7787"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.06483765834312989"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
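   ],
   "source": [
    "# dython signature: correlation_ratio(categories, measurements).\n",
    "# Here the credit class is the category and the cluster id the measurement.\n",
    "correlation_ratio(transactions['classification'].values, y)"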
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 235
    },
    "colab_type": "code",
    "id": "0welHRx7EcsT",
    "outputId": "9a2ba5e8-62ff-4d32-a4d3-3c34e451d70d"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age_mean</th>\n",
       "      <th>age_std</th>\n",
       "      <th>creditamount</th>\n",
       "      <th>duration</th>\n",
       "      <th>count</th>\n",
       "      <th>class_mean</th>\n",
       "      <th>class_std</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>cluster</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>47.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5103.000000</td>\n",
       "      <td>24.000000</td>\n",
       "      <td>1</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>38.979079</td>\n",
       "      <td>11.482339</td>\n",
       "      <td>4533.991632</td>\n",
       "      <td>26.326360</td>\n",
       "      <td>239</td>\n",
       "      <td>1.238494</td>\n",
       "      <td>0.427057</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>33.959108</td>\n",
       "      <td>11.343443</td>\n",
       "      <td>2616.107807</td>\n",
       "      <td>17.613383</td>\n",
       "      <td>269</td>\n",
       "      <td>1.249071</td>\n",
       "      <td>0.433281</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>31.202703</td>\n",
       "      <td>10.389641</td>\n",
       "      <td>1752.094595</td>\n",
       "      <td>13.108108</td>\n",
       "      <td>74</td>\n",
       "      <td>1.283784</td>\n",
       "      <td>0.453911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>35.345324</td>\n",
       "      <td>11.016844</td>\n",
       "      <td>3235.354916</td>\n",
       "      <td>21.292566</td>\n",
       "      <td>417</td>\n",
       "      <td>1.371703</td>\n",
       "      <td>0.483840</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          age_mean    age_std  creditamount   duration  count  class_mean  \\\n",
       "cluster                                                                     \n",
       "3        47.000000        NaN   5103.000000  24.000000      1    1.000000   \n",
       "1        38.979079  11.482339   4533.991632  26.326360    239    1.238494   \n",
       "2        33.959108  11.343443   2616.107807  17.613383    269    1.249071   \n",
       "0        31.202703  10.389641   1752.094595  13.108108     74    1.283784   \n",
       "4        35.345324  11.016844   3235.354916  21.292566    417    1.371703   \n",
       "\n",
       "         class_std  \n",
       "cluster             \n",
       "3              NaN  \n",
       "1         0.427057  \n",
       "2         0.433281  \n",
       "0         0.453911  \n",
       "4         0.483840  "
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = clustering2.labels_\n",
    "clusters = transactions.join(\n",
    "    pd.DataFrame(data=y, columns=['cluster'])\n",
    ").groupby(by='cluster').agg(\n",
    "    age_mean=pd.NamedAgg(column='age', aggfunc='mean'),\n",
    "    age_std=pd.NamedAgg(column='age', aggfunc='std'),\n",
    "    creditamount=pd.NamedAgg(column='creditamount', aggfunc='mean'),\n",
    "    duration=pd.NamedAgg(column='duration', aggfunc='mean'),\n",
    "    count=pd.NamedAgg(column='age', aggfunc='count'),\n",
    "    class_mean=pd.NamedAgg(column='classification', aggfunc='mean'),\n",
    "    class_std=pd.NamedAgg(column='classification', aggfunc='std'),\n",
    ").sort_values(by='class_mean')\n",
    "clusters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "id": "8M1YzhIJGq5x",
    "outputId": "43c99ab0-83bf-4679-ca97-1e481a4f5aaa"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.11809909247898157"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "correlation_ratio(transactions['classification'].values, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "YiKVmys0H9eb"
   },
   "outputs": [
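   ],
   "source": [
    "# Cluster directly on the standardized credit class, amount, and duration.\n",
    "distances = squareform(pdist(\n",
    "    StandardScaler().fit_transform(\n",
    "        transactions[['classification', 'creditamount', 'duration']]\n",
    "    )\n",
    "))\n",
    "# random_state=0 keeps the pre-0.23 behavior (needs scikit-learn >= 0.23).\n",
    "clustering3 = AffinityPropagation(damping=0.9, random_state=0).fit(distances)\n",
    "# Note: scikit-learn >= 1.2 renames the `affinity` parameter to `metric`.\n",
    "clustering4 = AgglomerativeClustering(\n",
    "    n_clusters=5, affinity='precomputed', linkage='average'\n",
    ").fit(distances)"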
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 68
    },
    "colab_type": "code",
    "id": "YJx_NKI2IZWp",
    "outputId": "1bc83c60-c332-4dd0-9448-879b3d74ba62"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "23\n",
      "0.00887681559456858\n",
      "0.3178388028194592\n"
     ]
    }
   ],
   "source": [
    "print(np.max(clustering3.predict(distances)))\n",
    "print(correlation_ratio(transactions['classification'].values, clustering3.predict(distances)))\n",
    "print(correlation_ratio(transactions['classification'].values, clustering4.labels_))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 235
    },
    "colab_type": "code",
    "id": "6YOehaj2IxbX",
    "outputId": "45ff341e-5691-4e0d-d60b-bbc07892ff55"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age_mean</th>\n",
       "      <th>age_std</th>\n",
       "      <th>creditamount</th>\n",
       "      <th>duration</th>\n",
       "      <th>count</th>\n",
       "      <th>class_mean</th>\n",
       "      <th>class_std</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>cluster</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>35.920000</td>\n",
       "      <td>7.348025</td>\n",
       "      <td>8289.740000</td>\n",
       "      <td>42.640000</td>\n",
       "      <td>50</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>35.548165</td>\n",
       "      <td>11.489624</td>\n",
       "      <td>2473.405963</td>\n",
       "      <td>17.779817</td>\n",
       "      <td>872</td>\n",
       "      <td>1.259174</td>\n",
       "      <td>0.438433</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>38.200000</td>\n",
       "      <td>16.457352</td>\n",
       "      <td>15271.600000</td>\n",
       "      <td>51.300000</td>\n",
       "      <td>10</td>\n",
       "      <td>1.600000</td>\n",
       "      <td>0.516398</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>34.530303</td>\n",
       "      <td>10.989813</td>\n",
       "      <td>7845.363636</td>\n",
       "      <td>41.545455</td>\n",
       "      <td>66</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>45.500000</td>\n",
       "      <td>31.819805</td>\n",
       "      <td>14725.500000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>2</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          age_mean    age_std  creditamount   duration  count  class_mean  \\\n",
       "cluster                                                                     \n",
       "4        35.920000   7.348025   8289.740000  42.640000     50    1.000000   \n",
       "2        35.548165  11.489624   2473.405963  17.779817    872    1.259174   \n",
       "0        38.200000  16.457352  15271.600000  51.300000     10    1.600000   \n",
       "1        34.530303  10.989813   7845.363636  41.545455     66    2.000000   \n",
       "3        45.500000  31.819805  14725.500000   6.000000      2    2.000000   \n",
       "\n",
       "         class_std  \n",
       "cluster             \n",
       "4         0.000000  \n",
       "2         0.438433  \n",
       "0         0.516398  \n",
       "1         0.000000  \n",
       "3         0.000000  "
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = clustering4.labels_\n",
    "clusters = transactions.join(\n",
    "    pd.DataFrame(data=y, columns=['cluster'])\n",
    ").groupby(by='cluster').agg(\n",
    "    age_mean=pd.NamedAgg(column='age', aggfunc='mean'),\n",
    "    age_std=pd.NamedAgg(column='age', aggfunc='std'),\n",
    "    creditamount=pd.NamedAgg(column='creditamount', aggfunc='mean'),\n",
    "    duration=pd.NamedAgg(column='duration', aggfunc='mean'),\n",
    "    count=pd.NamedAgg(column='age', aggfunc='count'),\n",
    "    class_mean=pd.NamedAgg(column='classification', aggfunc='mean'),\n",
    "    class_std=pd.NamedAgg(column='classification', aggfunc='std'),\n",
    ").sort_values(by='class_mean')\n",
    "clusters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "QxJSfAqMqVYF"
   },
   "outputs": [],
   "source": [
    "# this is from napkin-ml\n",
    "# https://github.com/eriklindernoren/NapkinML/blob/master/napkin_ml/utils/misc.py\n",
    "class Plot():\n",
    "    def __init__(self, colormap='viridis'):\n",
    "        self.cmap = plt.get_cmap(colormap)\n",
    "\n",
    "    def _transform(self, X, dim):\n",
    "        e_val, e_vec = np.linalg.eig(np.cov(X, rowvar=False))\n",
    "        idx = e_val.argsort()[::-1]\n",
    "        e_vec = e_vec[:, idx][:, :dim]\n",
    "        return X.dot(e_vec)\n",
    "\n",
    "    # Plot the dataset X and the corresponding labels y in 2D using PCA.\n",
    "    def plot_in_2d(self, X, y=None, title=None, accuracy=None, legend_labels=None):\n",
    "        X_transformed = self._transform(X, dim=2)\n",
    "        x1 = X_transformed[:, 0]\n",
    "        x2 = X_transformed[:, 1]\n",
    "        class_distr = []\n",
    "\n",
    "        y = np.array(y).astype(int)\n",
    "\n",
    "        colors = [self.cmap(i) for i in np.linspace(0, 1, len(np.unique(y)))]\n",
    "\n",
    "        # Plot the different class distributions\n",
    "        for i, l in enumerate(np.unique(y)):\n",
    "            _x1 = x1[y == l]\n",
    "            _x2 = x2[y == l]\n",
    "            _y = y[y == l]\n",
    "            class_distr.append(plt.scatter(_x1, _x2, color=colors[i]))\n",
    "\n",
    "        # Plot legend\n",
    "        if not legend_labels is None:\n",
    "            plt.legend(class_distr, legend_labels, loc=1)\n",
    "\n",
    "        # Plot title\n",
    "        if title:\n",
    "            if accuracy:\n",
    "                perc = 100 * accuracy\n",
    "                plt.suptitle(title)\n",
    "                plt.title(\"Accuracy: %.1f%%\" % perc, fontsize=10)\n",
    "            else:\n",
    "                plt.title(title)\n",
    "\n",
    "        # Axis labels\n",
    "        plt.xlabel('Principal Component 1')\n",
    "        plt.ylabel('Principal Component 2')\n",
    "\n",
    "        plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "lvQPqRAXpAeW"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(2, 4)\n",
      "(150,)\n",
      "(3, 4)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "DeviceArray([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n",
       "             2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n",
       "             2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 1, 0, 0, 0, 1, 0, 2,\n",
       "             1, 2, 1, 0, 2, 0, 0, 2, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2,\n",
       "             1, 1, 2, 0, 2, 0, 0, 0, 2, 1, 2, 0, 2, 1, 2, 2, 2, 0, 1, 2,\n",
       "             0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n",
       "             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n",
       "             0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import jax.numpy as jnp\n",
    "import numpy as np\n",
    "from jax import grad, jit, vmap\n",
    "from sklearn.base import ClassifierMixin\n",
    "import jax\n",
    "import random\n",
    "from scipy.stats import hmean\n",
    "\n",
    "\n",
    "class KMeans(ClassifierMixin):\n",
    "    def __init__(self, k, n_iter=100):\n",
    "      self.k = k\n",
    "      self.n_iter = n_iter\n",
    "      self.euclidean = jit(vmap(\n",
    "          lambda x, y: jnp.linalg.norm(\n",
    "              x - y, ord=2, axis=-1, keepdims=False\n",
    "          ), in_axes=(0, None), out_axes=0\n",
    "      ))\n",
    "      \n",
    "    def adjust_centers(self, X):\n",
    "        jnp.row_stack([X[self.clusters == c].mean(axis=0)\n",
    "          for c in self.clusters\n",
    "        ])\n",
    "\n",
    "    def initialize_centers(self):\n",
    "        '''roughly the kmeans++ initialization\n",
    "        '''\n",
    "        key = jax.random.PRNGKey(0)\n",
    "        # jax doesn't have uniform_multivariate\n",
    "        self.centers = jax.random.multivariate_normal(\n",
    "            key, jnp.mean(X, axis=0), jnp.cov(X, rowvar=False), shape=(1,)\n",
    "        )\n",
    "        for c in range(1, self.k):\n",
    "            weights = self.euclidean(X, self.centers)\n",
    "            if c>1:\n",
    "              weights = hmean(weights ,axis=-1)\n",
    "              print(weights.shape)\n",
    "\n",
    "            new_center = jnp.array(\n",
    "                random.choices(X, weights=weights, k=1)[0],\n",
    "                ndmin=2\n",
    "            )\n",
    "            self.centers = jnp.row_stack(\n",
    "                (self.centers, new_center)\n",
    "            )\n",
    "            print(self.centers.shape)\n",
    "\n",
    "    def fit(self, X, y=None):\n",
    "        self.initialize_centers()\n",
    "        for iter in range(self.n_iter):\n",
    "            dists = self.euclidean(X, self.centers)\n",
    "            self.clusters = jnp.argmin(dists, axis=-1)\n",
    "            self.adjust_centers(X)\n",
    "        return self.clusters\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.datasets import load_iris\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "kmeans = KMeans(k=3)\n",
    "kmeans.fit(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "mq6IYpxj29US"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEWCAYAAAB42tAoAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3de5wcVZn/8c+3cyGETOSSRBIJRlAQVhAhoIsEiYEgokH8qWhQQX6YBVEQFMXo6ycoGy8o7OIimMUgyzIqKpggLAQCxKgLJEEhXIwiIhDAhCAkIZDL9PP7o6onPT1V1dU93V3VPc/79epXZqqru84MQz19znPOc2RmOOecc3EKWTfAOedcvnmgcM45l8gDhXPOuUQeKJxzziXyQOGccy6RBwrnnHOJPFA455xL5IHCuTKSHpd0ZNn3H5b0D0nvqDjPJK2WNLTs2LDwmC9Och3FA4VzMSSdBFwGHGtmiyNO+QdwTNn3x4THnOsoHiiciyDpX4DvAkeb2e9iTrsG+HjZ9x8H/qvifV4l6YeSnpG0StKFkoaEz+0p6Q5JayU9J+laSTuWvfZxSZ+X9ICkFyX9VNKI8Lkxkn4l6QVJz0taIsn/f3ZN4X9YzvV3OvA1YJqZLUs475fA4ZJ2lLQTMAWYX3HOj4CtwOuBtwDTgVPD5wR8A5gA7ANMBM6veP2HgHcBrwP2B04Oj38OeAoYC7wamA34kJdriqHVT3Fu0DkKuBNYUeW8V4AbgRMIbvoLwmMASHo18G5gRzN7GXhJ0iXALOAHZvYo8Gh4+hpJFwNfrbjGpWb2dPh+NwIHhMe3AOOB14bvs6SeH9S5NLxH4Vx/pwN7AVcq8JCkDeFjSsW5/0Uw5NRv2Al4LTAMeCYcInoB+AEwDoJAIukn4ZDUOuC/gTEV7/Fs2dcbgVHh1xcRBJmFkh6TdN6AfmLnEniPwrn+/g5MAxYD3zezf0o4dwnBJ3sDfgPsWfbck8AmYIyZbY147ZzwdfuZ2fOS3gf8R5oGmtl6guGnz0l6E3CHpKVmtijN652rhfconIsQDvdMA94VDhfFnWfAe4EZVlGz38yeARYC35U0WlIhTGCXptp2ARuAFyW9Bjg3bfskvUfS6yUJeBHoAYo1/IjOpeaBwrkYZvYE8E7gA5K+kXDeQ2b2UMzTHweGAw8TTJ39OUEPBOAC4ECCG/1NwPU1NO8NwO0EgeZ/CXo+d9bweudSk29c5JxzLon3KJxzziXyQOGccy6RBwrnnHOJPFA455xL1JHrKMaMGWOTJk3KuhnOOdc2li9f/pyZjY16riMDxaRJk1i2LKlEj3POuXKS/hb3nA89OeecS5R5oJD0uXATmMoaN6XnT5L05/BxUqvb55xzg12mQ0+SJhKUXX4i5vmdCappTiaoibNc0gIz881hnHOuRbLuUVwCfIH4OvpHA7eZ2fNhcLiNoDa/c865FsksUEg6DlhlZvcnnPYaggqcJU+Fx6Leb5akZZKWrVmzpoEtdc65wa2pQ0+Sbgd2jXjqywQ7ck1v1LXMbC4wF2Dy5MlewMoBMH/lI1z0uyU8s34947u6OPfQKRy39z5ZN8u5ttLUQGFmR0Ydl7QfwdaO9wdVktkNuE/SIWZWvlHLKuCIsu93A+5qSmNdx5m/8hFmL1rIy1uDrSCeXr+e2YsWAniwcK4GmQw9mdkKMxtnZpPMbBLBkNKBFUEC4FZguqSdwj2Jp4fHnKvqot8t6Q0SJS9v3cpFv/NdQ52rRdbJ7H4kTZZ0JYCZPQ98HVgaPr4WHnOuqmfWr6/puHMuWi5WZoe9itLXy4BTy76fB8zLoFmuzY3v6uLpiKAwvqsrg9Y4175y16NwrlHOPXQK2w/t+1lo+6FDOffQKRm1yLn2lIsehXPNUEpYt9OsJ5+l5fLIA4XraMftvU/b3Gh9lpbLKx96ci4nfJaWyysPFM7lhM/ScnnlgcK5nIib
jeWztFzWPFA4lxM+S8vllSezncuJdpyl5QYHDxTO5Ug7zdJyg4cPPTnnnEvkgcI551wiDxTOOecSeaBwzjmXyAOFc865RD7rybUtL6DnXGt4oHBtZ/7KR7jgrkW8sGlT7zEvoOdc8/jQk2srpQqr5UGixAvoOdccHihcW4mqsFrOC+g513geKFxbqRYIvICec43ngcK1laRA4AX0nGsODxQNMn/lIxx21Vz2vPS7HHbVXOavfCTrJnWkqAqrADuNGMGcadM9ke1cE/ispwbwLSxbxyusOtd6HigaIGkLS7+BNZ5XWHWutXzoqQF8C0vnXCfzQNEAvoWlc66TeaBoAN/C0jnXyTxH0QCeYHXOdTIPFA3iCVbnXKfKNFBI+hzwHWCsmT0X8XwPsCL89gkzm9HK9jlXL69s6zpJZoFC0kRgOvBEwmkvm9kBLWqScw3h62pcp8kymX0J8AXAMmyDcw2XtK7GuXaUSaCQdBywyszur3LqCEnLJN0t6X1V3nNWeO6yNWvWNK6xztXI19W4TtO0oSdJtwO7Rjz1ZWA2wbBTNa81s1WS9gDukLTCzP4SdaKZzQXmAkyePNl7KS4z47u6eDoiKLTjuppF3UuYN7ubNU+uZezEXThlzkymzfRp34NN0wKFmR0ZdVzSfsDrgPslAewG3CfpEDN7tuI9VoX/PibpLuAtQGSgcJ2lnZPB5x46pU+OAtpvXc2i7iV8/6yrWLd2W8Bb/cRzXDLrCgAPFoNMy4eezGyFmY0zs0lmNgl4CjiwMkhI2knSduHXY4C3Aw+3ur2u9UrJ4KfXr8fYlgxul4q8x+29D3OmTWdCVxcCJnR1tVVl20XdS7hk1hV9gkTJpo2bmTe7O4NWuSzlah2FpMnAaWZ2KrAP8ANJRYKA9k0z80AxCHRCkcV2Xlczb3Y3mzZujn1+zZNrW9galweZB4qwV1H6ehlwavj174D9MmqWy1Dek8HFjQtgw8VQfAYK42HUORRGds4Sn2qBYOzEXVrUEpcXXuvJ5U5eiyzOX/kI59/yWTY+fx4UnwYs+HfdV4Lg0SGSAsF2I4dzypyZLWyNywMPFC538lhksZQ3+b9v+DUjh26tePaVoIfRIU6ZM5PtRg7vd7xr51GcPfc0T2QPQpkPPTlXKesii1Ezrkp5k/EjN0S/qPhM4uvbKV9RCgQ+LdaVyKzzlhxMnjzZli1blnUzXBuqLL8BQW+m9P3iY6/lNTtEBIvCBArj7op9fbVZT52e93D5J2m5mU2Oes6HnpwrEzfjakiw5ofvPHAIG7dWdsRHwKhzEl+fVL6juHEBrPtKR+c9XHvzQOFcmbiZVT1mbD90KDc++Qa+vPRwVr00iqLBxuJYGH1h76f/umZsbbgYeKXiYGflPVx7iw0UkvYLayw9KWmupJ3Knru3Nc1zrrXiZlaVFs1N6OriV0++gY/8+jRuWjefURN+22eIqK4ZW2X5jVTHnWuxpGT25cD5wN0Eaxt+I2lGWGtpWAva5lzLJZXfSLOIrq7yHYXx4bBTxHHnciBp6KnLzG4xsxfM7DvAp4FbJL0NLw3uOtRAy2/U9fpR5wAjKg5uy3u0wqLuJZw46XSmD/kQJ046nUXdXhLdbRM760nS/cDhZvZi2bH9gV8AO5tZbpdn+qwn1yqNmgqb5aynUm2n8rId240c7msmBpl6Zz19i6DeUi8zewCYBlzfuOa1t+LGBRRXH0Hx2b2Df32myqDRyOKFhZEzKIy7i8KuK4N/Wzg1Nqq2U1Txv0vPuJKjh53AUYUPcvSwE7j0jCtb1kaXrdhAYWbdZnZ3xPEnzOyTzW1We/BpjYNbp+xkF1fbqfz4pWdcyY2X30qxpwhAsafIjZff6sFikPDpsQPh0xoHtbRTYeevfITDrprLnpd+l8Oumtuyculp8w5xtZ3Kj98097bIc+KOu87igWIgfFrjoJZmKmxWe2uU8g6rn3gOM+vddCgqWETVdqos/lfqSVQqHfdkeGerGigk
vT3NsUEpbvqiT2scFNIUL8xqeCpt3gGC2k5nzz2NcbuPQRLjdh/TL5FdGBJ9qygMKdQUlFx7StOj+F7KY22trqR0DqY1uuykmQobNTz13ol/5seHX9HUCRBp8g7lps2cwrWPX87Cnuu49vHL+812OnbWUZGvO3bWUTUFJdeeYhfcSfpn4FBgrKTyO99oYEizG9ZKvUnpUr6hlJSGfrNP+k1jHHE8bF7sxdwGqWqL8MZ3dfF0WbB478Q/868Hl5UqT/hbG4ixE3dh9RPPRR6vx5mXnQoEOYliT5HCkALHzjqKMy87lelXLIx8TWVQWtS9xCvStqmkHsVwYBRBMOkqe6wDPtD8prVQyqR05CynV24IgkMG0xpd/lUOT31+/3tT72cxkKnXafIOtTrzslO5dctPua34M27d8tPe4JEmGe7DU+0taXrsYjO7AHibmV1Q9rjYzP7cwjY2X9qktM9ycjWqHJ5Ks58FDHzqdZq8Q6OkCUo+PNXe0mxctJ2kucCk8vPN7J3NalTLpa2147OcXB3Kh6eKq29M97eW9KEkZa912swpLRnaSbPRUa05E5cvaQLFz4ArgCuBnuY2JyOjzumbowAik9JevM0NVNq/tTb7UFItKDU6Z+JaK82sp61mdrmZ3Wtmy0uPpreshQojZ8DoC6EwAVDwb9keA718lpMboNR/ax029boZORPXOml6FDdK+hRwA7CpdNDMnm9aqzJQGDmjape+MHIGRfAtK92ApPlbi+p5vLJRzPvmjuxzxJKWzhZqxGwl34e7vVXdM1vSXyMOm5nt0ZwmDZxXj3V5MZCqsMWNC3j5799gu+3WsmbVMK76xq7c+cudW1rZ1SvLDh5J1WOrBop25IHC5UG/9TkAjIgeaopx4qTTI8f2x+0+hmsfv7wxDc3x9V3r1FtmvPTikZK+Es58QtIbJL2n0Y10LgtNLRPfgOnUaWYLNbPOUrXre42nwSFNMvsqYDPBKm2AVcCFTWuRcy3S9DLxDZi5VG0xW7MXsiVd3xfRDR5pAsWeZvZtYAuAmW0E1NRW5ZhvVNRBmr2AMnaGkqX+24maLYTgrcceBDR/IVvSbCVfRDd4pAkUmyVtT7hPtqQ9KZv9VA9J50taJekP4ePdMee9S9JKSY9KOm8g12wE36iowzR7rcKoc4BhMddI97czbeYUpp80te9HM4OFV9/Jou4lTVvIVhpS+tbHvsfwEcMZvUtXvxXevohu8EgTKL4K3AJMlHQtsAj4QgOufYmZHRA+bq58UtIQ4DLgGGBf4COS9m3AdetX4ydQ733kXEvWKiRNFknXe7nnpuX93qb0yT1NnaVaVQ4prX9+A5te3sQXr/lMn8qyzbi2y6eqgcLMbgPeD5wM/BiYbGZ3NbdZABwCPGpmj5nZZuAnwHEtuG68Gj6Beu+jDTR4AWXlBwPWXQhUFgCsfFH13kvSJ/dmLGRLO6Tki+gGj7Q73I0A/kFQOXZfSYc34NqflvSApHmSdop4/jXAk2XfPxUey04tn0C9gGDupV4lnULkBwNeSPHK6vmKpE/uzSj+l3ZIqZWFB1220iy4+xZwAvAQUNoP0cws8f8mSbcDu0Y89WXgbuA5gg7114HxZnZKxes/ALzLzE4Nv/8Y8FYz+3TM9WYBswB23333g/72t78l/lz1qGVefPHZvYkedhCFXVc2vG0uW8XVR0TXAUstfn1Fqxe9xa2dKAwpYEXzVdUdKmkdRZoSHu8D9jazmhLYZnZkysb9J/CriKdWARPLvt8tPBZ3vbnAXAgW3KVvaXo1lfDwAoKDy4AT4PGVYVtd/uKUOTP7BSbYtj92aRpsedtcZ0sTKB4jmLoxoJlO5SSNN7PS/1nHAw9GnLYUeIOk1xEEiA8DmQ9+pqrTAzFVQoHh72hKu1xj1Vx6I+6DQT/DCGeaR1w0Pti0qmR46VqwLTCpoN4gUVLKWXigGBzS5Cg2An+Q9ANJl5YeA7zutyWtkPQAMBU4G0DSBEk3A5jZ
VuDTwK3AI8B1ZvbQAK/bEr03mX45CuCVGzyhnXN1TUSITIxH2ULsTsI56m2W76FtxegOep6mwfoK8eZK06NYED4axsw+FnP8aeDdZd/fDPSbOptH2z6BVvtUWdvmMy4DdWwaFDksGfu30EMQVKrsSZETed9LojKH40NjjZdmeuzVBNNil4eP7vCYC/X9BJrmBfncfMaF6lyIVxg5I9g3Pdw/PZhNFWUIQZAIexYDmG3VCo2cBlvvJ/9F3Ut4/5hPcFThgxxV+CD/Z+wpva/1FeLNV7VHIekI4GrgcYL1oRMlnWRmv25u09pI3DBTLKP47D5AT3CT8D0t8qVRExHi8lS9G0WGPYuc//efNnMKD/12JTfNvY1iT5HCkALTT5pa86f1ej/5L+pewkWfuIyeLds22Fy3dj3fOeX7gG+z2gppchTfBaab2TvM7HDgaOCS5jarzdTVQwj/6H0hXv40aCFev3UakbmJ5qytaeSY/aLuJSy8+s7ehHaxp9hbQqQW9X7ynze7u0+QKNm6eWvTVqe7vtIEimFm1jvx38z+RGwBm0FqwElIX4iXJ41ciFc+HLVtGVKFBg9FDqSqa1SAadTQTr0ly5N6Bs1ane76SpPMXibpSuC/w+9PBHxXoHKRQwwiuc5PBc9b5ErqadA1vWlz1tZUblX68oZXYm/slUM85a/t2nkUL63b2PvpvRRgKt+rpNahnaSkeNKwVNzrSs/5NqvNl6ZHcTrwMHBm+Hg4POaonApbnpy8KCGZGaXgxQM7XYNrS0F072H98xsiz628sVe+dt3a9f2GeDZt3ExhSPRtQgXVNPxUb8nyU+bMZMiw/sN2Q4cP7e01lE/nLS9c6BojzaynTcB/ABcQVJK9rNZV2p2q/2yniuRkTb2EHrx4YGdr5JBWSdQNNk7lmH3a1xZ7iv33xAiP17JRUVJtqKRhqWkzp3DuVWfQtfOo3uOjd+ni8/M+5QGhRdLUejoWuAL4C8F4yuuAfzGz/2l+8+rTqj2zY+v7FCZQGHdXyvo/Q9g2C6b/eziXZPqQD5F23/uunUdxxqWn9N5c0762MKTAsbOO6p31VKkR+2f73tzZG9Ce2QSznqaa2RFm9g6CldQ+6wmqz7evulp3BJFBIum9XUerdQ+TuJk9o3fpYvQuXX2OrX9+Q58eQNpZQaVZTlFBAhozDdUT0vmWJlCsN7NHy75/DFjfpPa0lypbXQLBUAM7xpxXltdI/d6uU9VTOiTuBvupf/8EI3bYrt/55bOVol47dPjQyI2Ok3IVjZiG6iXL8y1NoFgm6WZJJ0s6CbgRWCrp/ZLe3+T25VtSjyH8nxyAwsiENymVcyiX33IOronq2MOk3nH/8mmvpQAwbvcxfH7ep1BUpCA6V9HIT/2ekM6vNDmKqxKetsp9JPKgVTkKSFHjqTAhHEaK+T2HK7NrqlTq2lpcZdpG72ESN+7ftfMoNr+yOXZ/i6R8QWmGkk9D7TxJOYqqgaIdtTJQlCT9Tx5fIE4w+iIPCoNI0uZXsR846pzYELfh0Xbbb8e6tf1Hj0uJ41ZvlOTyYUDJbEmvk3SxpOslLSg9Gt/MNpe0TWrcENWIDycGiVoTm64NJA0vNWCdRfnq5nmzu5l+0tR+w1LV1lm0a77AS403T5qhp/uBHwIrKKtBYGaLm9u0+mXSo6iyTWqtG+HUsu2qax/Vhpdq3jCpTNqeQCdORfVe0MANaOhJ0j1m9tamtKxJsggUUMeuaEnvVWWNhmtPzfzvmjYAdOJNNc3PXlnqxPMrfQ10z+x/l/RVYCFl26Ga2X0Nal/HaGh9oDr3RHA5F1kXrDGz3NKW2+7E2khpCg765kb1SxMo9gM+BryTbUNPFn7vapS619GkAnIuW5E74VXpeab9JFzLTnSt3IO7Far97Em1pDrp99AsadZRfBDYI9yPYmr48CBRh5oWVDWhgJzLh8qd8KoFibQlwwfz6uZqP7tvbjQwaQLFg8QvLXa1qGFBVTMKyLn2
U8teEO06WwkGPmOp2s/umxsNTJpk9l3A/sBS+uYocnvHyiqZXU2jF1S5zhdXuE8SC3uuy6BFjbWoewnfP+uqfus6Gp1c78QEfqMNNJn91Qa3p+N43sE1Sy15h3YTdfMuaXT+oBMT+K1UNVCY2WJJrwYODg/da2arm9us9tFvvUMp7wD9g0UTZ7y4znTKnJmRn4QHknfIYppo1DWr7YfR6PxBpyXwWynNyuwPAfcSJLU/BNwj6QPNbljbaFHewVdpD06NzjsMZD/tesVdM25705JO6DV1irQrs48q9SIkjQVuN7M3t6B9dWlpUcAmrrTtvYav0nYNksWq7LhrFoYUYve48PxB6w1046JCxVDT2pSvGxwSajzVs79ApDrKTzsXJYtponHvXewpoiH9S5p37TzKg0TOpLnh3yLp1nA/ipOBm4DcboNar7qHdpLWO9R4g49tg6/Sdg2SxTTRuPcescN2WE//3vgRJ7zdg0TOVA0UZnYu8AOCKbL7A3PN7AvNblgrDeSTf2LeoYYbfGIbkirTukFnIGsOsliUF3fNTS9HJ7Jvmntb5HGvDpud2EAh6fWS3g5gZteb2Tlmdg6wRtKeLWthKwxwaCd2pW3sjbzQv9fQ5PLTrjMMNBmdxaK8uGtaMTo/GpW3yCIJ77aJTWZL+hXwJTNbUXF8P2COmb237otK5wOfBNaEh2ab2c0R5z1OsD93D7A1LtFSqdZkdrMWwkUnoSuFm9asOzexDY2sTOvaVyeVCD962AmRQaEwpMCtW37a51gn/dx5VW8y+9WVQQIgPDapAe26xMwOCB/9gkSZqeE5qYJEXZo0tNNvWIohEWeFvYak3sfGBTXVB3Kdq5NqFh0766jI48NHDOvXU0hTHdaHpZonKVAk1XfavtENyVQTh3bKb/Bl+z71VXwmfhc8euqbKeU6UlxiWAW13U3yzMtO5b2nH40KfWc+vfLSpn7DSklJeB+War6kQLFM0icrD0o6FVjegGt/WtIDkuZJ2inmHAMWSlouaVbSm0maJWmZpGVr1qxJOrWflhXgS+i59LYhqdfhBr2oxDAE4/rteJM887JTGbtb/yBQWfjwlDkzGTq8byGJocOHxq7wjiuc6OqTFCg+C3xC0l2Svhs+FgP/Fzir2htLul3SgxGP44DLgT2BA4BngO/GvM1hZnYgcAxwhqTD465nZnPNbLKZTR47dmy15vXTkqGduF7D8Hf0tiGx1+EGvcrEcGFI//+F426SeR2eSTucVplPLX3fScNxeRUbKMzs72Z2KHAB8Hj4uMDM/tnMnq32xmZ2pJm9KeIxP3zvHjMrAv8JHBLzHqvCf1cDN8Sd1y4KI2fAiOMJ8hVlXrlh29CST4V1VUybOYVrH7+chT3Xxc4cqrxJ1jM8M5DAUstr06ztmDe7m54tPX2e79nSw7zZ3V5CvAXSrKO408y+Fz7uaMRFJZXf9Y4n2POi8pwdJHWVvgamR53XdjYvpv/sprKhJZ8K62qQ9iZZ6/DMQMb9a31tmrUdSb2GwbxhU0mze4tZleL4tqQVkh4ApgJnA0iaIKk0A+rVwG/CWlP3AjeZ2S3ZNDe94sYFFJ89hOKzewWPvx/SNxFdZRGeb1jkapH2Jlnr8EwtgaXyJvX9s66qKSilWduRFBDbecOmRmhFMr9qUcB2lNXGRcG6ifOArRXPDIPR3wj2S159RMyeFBMojLur+Y10HSdN2fCkwnxfuPrT/c5Pu2FS0p4S1V5bi0ZuPJRFmfVmatQak4FuXOTS2nAx/YMEwJbguZEzfE8K13Bp9lmI2tcCgtlSl8y6ovd9StJumFRtT4mk19aiURsPVQac0qfv8mu0m1Yk85NKeKyXtC7isV7Suoa1oJMkzUwqPkPxha/Cui/SJ0j40JJrgdLwTNpZUgMd0qrUiJxBeRL/2scvT7yxx43Zd+JU2lYk85NmPXWZ2eiIR5eZjW5YCzpJ4swkg1d+TFCNpMzwd3iQcA0XdaOcNnNK6llS
acf9425GXTuPyixnkDRm34lTaVuRzE899CRpHGXTcczsiYa1olOMOicmR5HglesIZiA71xiXnnElN15xa+/kuvLhlVr24K53SGu7kcM549JTWhIY0m6xWuo1dOIe5K3YDzzNDnczCBbETQBWA68FHjGzf2pYKxosq2Q2lBLaFwIvpH5NYdc/Na9BblBZ1L2Eb37s0sj6kuN2HxN7Yx/o9qpZJIfjEtxxORNJfPGaz0Tmarp2HtWy4JZXA01mfx14G8H2p2+RNBX4aCMb2EkKI2fAyBkJFWkrRZXscK4+82Z3x/7ZrXlybVM+fabpeTRDXM8hbovV0lRagO+fdRXr1q7vfW798xtSJ7U7bdZUGmkCxRYzWyupIKlgZndK+remt6zdFcZHT4PtpyeYMutlw10DJI21l4ZXsrqx1yvuxpy0xWq/noXgrcceBAQ//7zZ3X0CBWwbnqr83ZRfv2vnUby0bmPvKvFOmDWVRpoFdy9IGgUsAa6V9O/AS81tVgeIrQYbocqOenVv0+oGndixdtGWK5WTEtNxP+u43ccw/aSpfSvlGCy8+s7e2U9pk9qV11+3dn2/UiLtPmsqjTSB4jjgZYIigbcAfwHq3rRosOi7whp6h5gKE0BRFdyjK8QOZJtWN/hEVpcVvPe0o9vyE29SYjppts89Ny3vNwRXfkMfSOmTKO08ayqNNLWeXgLGAu8GngeuM7PO/q00yLaKtH+isOsjwb/j7gJ7MfoF4TqM8h5Ev3UXgJcdd3GiprWed82ZnHnZqVVfm1V12aTrJn3yT5rCW63H0Oh1Iu08ayqNqjmKcP+J/wfcQdCZ+56kr5nZvGY3rmPF5S/0qojtU3v6nwdedtzFqicHkdWK5WrXrTadNe5nTfM6qJ7Uj3ufcoOhAGGaoadzgbeY2clmdhJwEPDF5jarw406BxjW/7htgPUXkrzHdsmrPG/hGiarFcvVrlvvYrI0r0uz0jvqfYYOH0rXzqMGVQHCNLOe1gLl0wPWh8dcnQojZ1CMXGuxFSzN+ouhwEtQDM8t5S3AZ065umS1YrnadeudztuoacCtWMzWDtIEikeBeyTNJ0gPHQc8IOkcADPzwfK6xOQpYg0BisGwlW2MCCivbCs86Aa9Wuf6Z7HYKpwAABGNSURBVLViOc11ax1Kq/zZv3jNZ9pqnUge12mkGXr6C/BLts0hmA/8FegKH64esXWhdiRy46LR3+rdprVaMtwNbvXsT5DV5j+Nvm4r9mZopry23/ejyEj/pDUEAeHC4MsNFwc3/sL4fovxfE8Ll6Te/QmyLMXRqOs2am+GrGTZ/rpKeEj6NzP7rKQbiSgKYGY+xlGn4sYF4fTWVwgmkoW/XgU9iVIZkFi+p4VLUG++Ie0QS6MDSiOHdtq9Omxe25+Uo7gm/Pc7rWjIYNG/J1EWg+2FVEnpwsgZFCGx1+EGr2bmG/K+8U+7V4fNa/uT9qNYHn65DFhiZovNbDHwG2BpKxrXkXp7EnHSLabbtpgvyFt4kHAlzcw35H3jn6xyLY2S1/anmfW0CDgS2BB+vz2wEDi0WY3qaGkSzp6UdgPQzCmdeR0aKWnH6ayVQ3nTT5rKPTctz1X70+xH8QczO6DasTzJczI7NhFdzpPSLqfaPVlcLg/TUOP21MhiEV9SMjvN9NiXJB1Y9mYHERQJdPWoWlXWk9Iuv/I6NFKrvExDzftQXkmaQPFZ4GeSlkj6DfBT4NPNbVbn6q0qS1QFWcGI4z3f4HIr7V7aeZeXG3Teh/JKquYozGyppDcCe4eHVprZluY2q3NtmxobVarDYPPiVjfJuZq028ZHUfJyg87rLKdKaXoUAAcD+wMHAh+R9PHmNalz9d1bIu4kT2Q712xp96NotnYZyqsaKCRdQ7CW4jCCgHEwEJnwcFVUnRpLQmkP51yj5OUG3S5DeWmmx04G9rVOrPXRaqmmxm6kuHGB5ymca6I8TaNth6G8NIHiQWBXwMdEBipuw6I+0q3Ods4N
TDvcoPMiTY5iDPCwpFslLSg9BnphSZ+R9EdJD0n6dsw575K0UtKjks4b6DUzFzk1VhEn+lanzrn8SNOjOL/RF5U0lWBfizeb2SZJ4yLOGQJcBhwFPAUslbTAzB5udHtaJbJGU1wPw5PaLiN5WIjm8iXN9NhmzNc8HfimmW0Kr7E64pxDgEfN7DEAST8hCC5tGyigf2XY+JLhntR2rZf3on951skBNnboKVxch6T1ktaVPdZLWjfA6+4FTJF0j6TFkg6OOOc1wJNl3z8VHotr7yxJyyQtW7NmzQCb10KRw1G+OttlIy8L0dpNXlZ6N0tsj8LMDgv/rWsXO0m3EyTBK305vO7OwNsIptteJ2mPgcysMrO5wFwIaj3V+z6t5iXDXZ7kZSFau0kKsJ3Qq0gcegrzBA+Z2RtrfWMzOzLhfU8Hrg8Dw72SigRJ8/KuwCpgYtn3u4XHOk7VjYqca5F2WSmcN50eYBNnPZlZD7BS0u4Nvu4vgakAkvYChgOVf51LgTdIep2k4cCHgQHPtmpnxY0LKK4+guKzewf/bhzUvw7XBHlZiNZu8rLSu1nSTI/dCXhI0qIGTo+dB+wh6UHgJ8BJZmaSJki6GcDMthIUH7wVeAS4zsweGuB121bf8h8W/LvuKx4sXEO1y0rhvOn0AJtmP4p3RB1v0myohsjzfhT1ip8d5XtXOJcH7T7rKWk/itgchaQRwGnA64EVwA/DT/kuC3HrKny9hXMtkxQMOnmld1Iy+2pgC7AEOAbYFzirFY1yEeIW5/l6C+daYjCvMUnKUexrZh81sx8AHwA6+zfRAsWNCyj+/RCKz+4VPg5Jn2Pw9RbOZWowrzFJ6lH0bk5kZlulqJpELq0gGf0lyn6tBAUAz0tVANDXWziXrU6fApskKVC8uWwFtoDtw+8FmJmNbnrrOsmGi+kbJEq2Bs+luOH7egvnsjOY15jEDj2Z2RAzGx0+usxsaNnXHiRqlZR09oS0c7nX6VNgk6TdCtUNVFLSuY6EtC++c661BvMakzRlxl0jjDonIkcBMLTmhHTv4rvStqqlxXf4ZkfONVMnT4FN4j2KFimMnAGjvwHasezojjD6m7Xf3CP33vbNjpxzzeE9ihZqWDLaF98551rIexTtKC6n4YvvnHNN4IGiHfniO+dcC/nQUxvyxXfOuVbyQNGmfPGdc65VfOjJOedcIg8UzjnnEnmgcM45l8gDhXPOuUQeKJxzziXyQOGccy6RBwrnnHOJPFA455xL5IHCOedcIg8UGfBNh5xz7cRLeLSYbzrknGs33qNoNd90yDnXZjxQtJpvOuScazMeKFrNNx1yzrUZDxSt5psOOefaTGaBQtJnJP1R0kOSvh1zzuOSVkj6g6RlrW5joxU3LijLUQwJDhYmwOgLPZHtnANgUfcSTpx0OtOHfIgTJ53Oou4lWTcpm1lPkqYCxwFvNrNNksYlnD7VzJ5rUdOapt9sJ3oo9SQ8SDjnIAgSl8y6gk0bNwOw+onnuGTWFQBMmzkls3Zl1aM4HfimmW0CMLPVGbWjdXy2k3Ouinmzu3uDRMmmjZuZN7s7oxYFsgoUewFTJN0jabGkg2POM2ChpOWSZiW9oaRZkpZJWrZmzZqGN3jAfLaTc66KNU+urel4qzRt6EnS7cCuEU99ObzuzsDbgIOB6yTtYWZWce5hZrYqHJq6TdIfzezXUdczs7nAXIDJkydXvk/2CuODxXVRx51zDhg7cRdWP9F/pH3sxF0yaM02TetRmNmRZvamiMd84CngegvcCxSBMRHvsSr8dzVwA3BIs9rbdD7byTlXxSlzZrLdyOF9jm03cjinzJmZUYsCWQ09/RKYCiBpL2A40CeMStpBUlfpa2A68GCL29kwhZEzYPSFwSwn5LOdnHP9TJs5hbPnnsa43ccgiXG7j+HsuadlmsgGUP/RnhZcVBoOzAMOADYDnzezOyRNAK40s3dL2oOgFwHBUFW3mf1rmvefPHmyLVvW9rNpnXOu
ZSQtN7PJUc9lMj3WzDYDH404/jTw7vDrx4A3t7hpzjnnKvjKbOecc4k8UDjnnEvkgcI551wiDxQ55bvgOefywne4yyHfBc85lyfeo8gjrwvlnMsRDxR55HWhnOsoeSwdXgsfesojrwvlXMfIa+nwWniPIo+8LpRzHSOvpcNr4T2KHCqMnEERgpxE8ZmgJ+EbHDnXlvJaOrwWHihyqjByBnhgcK7t5bV0eC186Mk555oor6XDa+E9Cueca6JSwnre7G7WPLmWsRN34ZQ5M9smkQ0ZlRlvNi8z7pxztUkqM+5DT8455xJ5oHDOOZfIA4VzzrlEHiicc84l8kDhnHMuUUfOepK0Bvhb+O0YoP9ql3zytjZPO7W3ndoK7dVeb2u815rZ2KgnOjJQlJO0LG7KV954W5unndrbTm2F9mqvt7U+PvTknHMukQcK55xziQZDoJibdQNq4G1tnnZqbzu1Fdqrvd7WOnR8jsI559zADIYehXPOuQHwQOGccy5RRwYKSRMl3SnpYUkPSTor6zYlkTRC0r2S7g/be0HWbapG0hBJv5f0q6zbkkTS45JWSPqDpNyXFJa0o6SfS/qjpEck/XPWbYoiae/wd1p6rJP02azbFUfS2eH/Ww9K+rGkyr2Gc0XSWWFbH8rD77UjcxSSxgPjzew+SV3AcuB9ZvZwxk2LJEnADma2QdIw4DfAWWZ2d8ZNiyXpHGAyMNrM3pN1e+JIehyYbGZtschK0tXAEjO7UtJwYKSZvZB1u5JIGgKsAt5qZn+rdn6rSXoNwf9T+5rZy5KuA242sx9l27Jokt4E/AQ4BNgM3AKcZmaPZtWmjuxRmNkzZnZf+PV64BHgNdm2Kp4FNoTfDgsfuY3gknYDjgWuzLotnUTSq4DDgR8CmNnmvAeJ0DTgL3kMEmWGAttLGgqMBJ7OuD1J9gHuMbONZrYVWAy8P8sGdWSgKCdpEvAW4J5sW5IsHMr5A7AauM3M8tzefwO+ABSzbkgKBiyUtFzSrKwbU8XrgDXAVeGw3pWSdsi6USl8GPhx1o2IY2argO8ATwDPAC+a2cJsW5XoQWCKpF0kjQTeDUzMskEdHSgkjQJ+AXzWzNZl3Z4kZtZjZgcAuwGHhN3P3JH0HmC1mS3Pui0pHWZmBwLHAGdIOjzrBiUYChwIXG5mbwFeAs7LtknJwuGxGcDPsm5LHEk7AccRBOIJwA6SPpptq+KZ2SPAt4CFBMNOfwB6smxTxwaKcKz/F8C1ZnZ91u1JKxxquBN4V9ZtifF2YEY49v8T4J2S/jvbJsULP01iZquBGwjGffPqKeCpst7kzwkCR54dA9xnZn/PuiEJjgT+amZrzGwLcD1waMZtSmRmPzSzg8zscOAfwJ+ybE9HBoowOfxD4BEzuzjr9lQjaaykHcOvtweOAv6YbauimdmXzGw3M5tEMORwh5nl8tOZpB3CyQyEQzjTCbr1uWRmzwJPSto7PDQNyOUEjDIfIcfDTqEngLdJGhneG6YR5C1zS9K48N/dCfIT3Vm2Z2iWF2+itwMfA1aE4/4As83s5gzblGQ8cHU4e6QAXGdmuZ522iZeDdwQ3BsYCnSb2S3ZNqmqzwDXhkM6jwGfyLg9scLgexTwL1m3JYmZ3SPp58B9wFbg9+SoPEaMX0jaBdgCnJH1pIaOnB7rnHOucTpy6Mk551zjeKBwzjmXyAOFc865RB4onHPOJfJA4ZxzLpEHCpcrknrCaqQPSvpZWMIg6rzf1fn+kyVdOoD2bYg5vqukn0j6S1gu5GZJe9V7nTyQdISkyIVpkt4o6X8lbZL0+Va3zbWWBwqXNy+b2QFm9iaCypmnlT8ZFnXDzOpaWWtmy8zszIE3s0+bRLDq+y4z29PMDgK+RLCOo50dQfwK5ueBMwlqKLkO54HC5dkS4PXhJ9slkhYQrlQufbIPn7urbA+Ha8MbN5IOlvS7cJ+PeyV1hef/Knz+fEnXhJ+M/yzpk+HxUZIWSbpPwV4Wx1Vp51Rg
i5ldUTpgZveb2RIFLgp7SCsknVDW7sWS5kt6TNI3JZ0YtnOFpD3D834k6QpJyyT9Kay1VdrD5Krw3N9LmhoeP1nS9ZJuCX+mb5faJGl6+LPeF/bWRoXHH5d0QdnP+0YFxTRPA84Oe3hTyn9gM1ttZksJFoS5DtepK7Ndmwt7DscQFEWDoObRm8zsrxGnvwX4J4LS0b8F3i7pXuCnwAlmtlTSaODliNfuD7wN2AH4vaSbCCr4Hm9m6ySNAe6WtMDiV6e+iWDPkyjvBw4A3gyMAZZK+nX43JsJSko/T7AK+0ozO0TBRlufAUob1kwiqFG1J3CnpNcDZxBUqN9P0hsJKuSWhroOCH8nm4CVkr4X/uxfAY40s5ckfRE4B/ha+JrnzOxASZ8CPm9mp0q6AthgZt5rGOQ8ULi82b6s7MoSgppdhwL3xgQJwueeAghfOwl4EXgm/NRLqXpw2NkoN9/MXgZelnQnwQ35JmCOgkqzRYK9TF4NPFvHz3MY8GMz6wH+LmkxcDCwDlhqZs+E7foLQbVQgBUEvZSS68ysCPxZ0mPAG8P3/V74s/1R0t+AUqBYZGYvhu/7MPBaYEdgX+C34e9gOPC/ZdcoFc5cTsZ7H7j88UDh8ublsNx6r/DG9lLCazaVfd1DbX/Xlb0EA04ExgIHmdkWBZVyk7bOfAj4QA3XLClvd7Hs+yJ9f4aoNqZ939LvQwT7nHykymtq/f25QcBzFK5TrQTGSzoYIMxPRN0AjwvH+3chSN4uBV5FsOfGlnDs/7VVrnUHsJ3KNkaStH84rr8EOEHBxlRjCXawu7fGn+WDkgph3mKP8GdbQhDQCIecdg+Px7mbYEju9eFrdlD1WVnrga4a2+o6kAcK15HMbDNwAvA9SfcDtxHdK3iAYP+Pu4Gvm9nTwLXAZEkrgI9TpeR7mLs4HjhSwfTYh4BvEAxV3RBe436CgPKFsJx4LZ4gCC7/Q7B38ivA94FC2MafAieb2aa4NzCzNcDJwI8lPUAw7PTGKte9ETg+KpmtYDrwUwR5jq9IeirMA7kO5NVj3aAl6XxynqyV9CPgV2b286zb4gYv71E455xL5D0K55xzibxH4ZxzLpEHCuecc4k8UDjnnEvkgcI551wiDxTOOecS/X/bQVyIxiOM6AAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "Plot().plot_in_2d(X, kmeans.clusters, \"K-Means\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "7cwctM0Bz3Tm"
   },
   "source": [
    "You can find a good comparison of [clustering methods in the scikit-learn documentation](https://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html#sphx-glr-auto-examples-cluster-plot-cluster-comparison-py).\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "marketing segmentation",
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
